Preface


This book is designed to serve as a basic text of modern algebra at the undergraduate 
level. Modern mathematics facilitates unification of different areas of mathematics.
It is characterized by its emphasis on the systematic study of a number of abstract 
mathematical structures. Modern algebra provides a language for almost all disci- 
plines in contemporary mathematics. 

This book introduces the basic language of modern algebra through a study of 
groups, group actions, rings, fields, vector spaces, modules, algebraic numbers, etc. 
The term Modern Algebra (or Abstract Algebra) is used to distinguish this area 
from classical algebra. Classical algebra grew over thousands of years. On the other 
hand, modern algebra began as late as 1770. Modern algebra is used in many ar- 
eas of mathematics and other sciences. For example, it is widely used in algebraic 
topology of which the main objective is to solve topological and geometrical prob- 
lems by using algebraic objects. Category theory plays an important role in this 
respect. Grigory Perelman solved the Poincaré conjecture by using the tools of
modern algebra and other modern mathematical concepts (beyond the scope of the 
discussion in this book). A study of algebraic numbers includes studies of various 
number rings which generalize the set of integers. It also offers a natural inspira- 
tion to reinterpret the results of classical algebra and number theory and provides 
a scope of greater unity and generality. Algebraic number theory provides impor- 
tant tools to solve many problems in mathematics. For example, Andrew Wiles 
proved Fermat’s last theorem by using algebraic number theory along with other 
theories (not discussed in this book). Moreover, an attempt has been made to integrate
the classical materials of elementary number theory with modern algebra
with an eye to applying them in different disciplines, especially in cryptography. Some
applications unify computer science with mainstream mathematics. This dispels, at
least partly, a general feeling that much of abstract algebra is of little practical
value. The main purpose of this book is to give an accessible presentation to its 
readers. The materials discussed here have appeared elsewhere. Our contribution 
is the selection of the materials and their presentation. The title of the book suggests
its scope, which is spread over 12 chapters and three appendices.

Chapter 1 studies basic concepts of set theory and properties of integers, which
are used throughout the book and in many other disciplines. Set theory occupies a 
very prominent place in modern science. There are two general approaches to set 
theory. The first one is called “naive set theory” initiated by Georg Cantor around 
1870; the second one is called “axiomatic set theory” originated by E. Zermelo in 
1908 and modified by A. Fraenkel and T. Skolem. This chapter develops a naive 
set theory, which is a non-formalized theory that uses a natural language to describe
sets and their basic properties. For a precise description of many notions of modern
algebra and also for mathematical reasoning, the concepts of relations, Zorn’s
lemma, mappings (functions), and cardinality of sets are very important. They form the
basics of set theory and are discussed in this chapter. The set of integers plays an 
important role in the development of science, engineering, technology and human 
civilization. In this chapter, some basic concepts and properties of integers, such as 
Peano’s axioms leading to the principle of mathematical induction, well-ordering 
principle, division algorithm, greatest common divisors, prime numbers, fundamen- 
tal theorem of arithmetic, congruences on integers, etc., are also discussed. Further 
studies on number theory are given in Chap. 10. 

Chapter 2 gives an introduction to group theory. This concept is used in subsequent
chapters. Groups serve as one of the fundamental building blocks for the
subject today called modern algebra. The theory of groups began with the work
of J.L. Lagrange (1736-1813) and E. Galois (1811-1832). At that time, mathematicians
worked with groups of transformations. These were sets of mappings that, under
composition, possessed certain properties. Mathematicians, such as Felix Klein
(1849-1925), adopted the idea of groups to unify different areas of geometry. In 
1870, L. Kronecker (1823-1891) gave a set of postulates for a group. Earlier definitions
of groups were generalized, in the first decade of the twentieth century, to the
present concept of an abstract group, which is defined by a set of axioms. In
this chapter, we make an introductory study of groups with geometrical applications 
along with a discussion of free abelian groups and structure theorem for finitely 
generated abelian groups. Moreover, semigroups, homology groups, cohomology 
groups, topological groups, Lie groups, Hopf groups, and fundamental groups are
also studied here. 

Chapter 3 discusses actions of semigroups, groups, topological groups, and Lie
groups. Each element of a group determines a permutation on a set under a group
action. For a topological group action on a topological space, this permutation is a
homeomorphism and for a Lie group action on a differentiable manifold it is a
diffeomorphism. Group actions are used in the proofs of the counting principle,
Cayley’s Theorem, Cauchy’s Theorem and Sylow Theorems for finite groups. The counting
principle is used to determine the structure of a group of prime power order. These
groups arise in the Sylow Theorems and in the description of finite abelian groups. 
Orbit spaces obtained by topological group actions, discussed in this chapter, are 
very important in topology and geometry. For example, n-dimensional real and
complex projective spaces are obtained as orbit spaces. Finally, semigroup actions
applied to theoretical computer science yield state machines, which unify computer
science with mainstream mathematics.

Rings also serve as fundamental building blocks for modern algebra. Chapter 4
introduces the concept of rings, another fundamental concept in the study of mod- 
ern algebra. A “group” is endowed with only one binary operation while a “ring” is 
endowed with two binary operations connected by some interrelations. Fields form 
a very important class of rings. The concept of rings arose through the attempts to 
prove Fermat’s last theorem and was initiated by Richard Dedekind (1831-1916) 
around 1880. David Hilbert (1862-1943) coined the term “ring”. Emmy Noether 
(1882-1935) developed the theory of rings under Hilbert’s guidance. Commutative rings
form a particular but important type of rings that play an important
role in algebraic number theory and algebraic geometry. Further, non-commutative
rings are used in non-commutative geometry and quantum groups. In this chapter,
Wedderburn’s theorem on finite division rings and some special rings, such as rings
of power series, rings of polynomials, rings of continuous functions, rings of
endomorphisms of abelian groups and Boolean rings, are also studied.

Chapter 5 continues the study of the theory of rings and introduces the concept of
ideals, which generalize many important properties of integers. Ideals and homomorphisms
of rings are closely related. Ideals play a role in the study of rings analogous to
that of normal subgroups in the theory of groups. The real significance of ideals
in a ring is that they enable us to construct other rings which are associated with 
the first in a natural way. Commutative rings and their ideals are closely related. 
Their relations develop ring theory and are applied in many areas of mathematics, 
such as number theory, algebraic geometry, topology and functional analysis. In this 
chapter, basic properties of ideals are discussed and explained with interesting examples.
Ideals of rings of continuous functions and the Chinese remainder theorem for
rings, with their applications, are also studied. Finally, applications of ideals to algebraic
geometry with Hilbert’s Nullstellensatz and the Zariski topology are
discussed and certain connections among algebra, geometry and topology are given.

Chapter 6 extends the concepts of divisibility, greatest common divisor, least
common multiple, division algorithm and fundamental theorem of arithmetic for
integers, with the help of the theory of ideals, to the corresponding concepts for rings.
The main aim of this chapter is to study the problem of factoring the elements of 
an integral domain as products of irreducible elements. This chapter also caters 
to the study of the polynomial rings over a certain class of important rings and 
proves the Eisenstein irreducibility criterion, Gauss Lemma and related topics. Our 
study culminates in proving the Gauss Theorem which provides an extensive class 
of uniquely factorizable domains. 

Chapter 7 continues to develop the theory of rings and studies chain conditions 
for ideals of a ring. The motivation came from an interesting property of the ring 
of integers Z: every ascending chain of its ideals terminates. This interesting
property of Z was first recognized by the German mathematician Emmy Noether 
(1882-1935). This property leads to the concept of Noetherian rings, named after 
Noether. On the other hand, Emil Artin (1898-1962) showed that there are some 
rings in which every descending chain of ideals terminates. Such rings are called 
Artinian rings in honor of Emil Artin. This chapter studies special classes of rings, 
such as Noetherian rings and Artinian rings and obtains deeper results on ideal the- 
ory. This chapter further introduces a Noetherian domain and an Artinian domain 
and establishes their interesting properties. Hilbert’s Basis Theorem, which gives an
extensive class of Noetherian rings, is also proved in this chapter. Its application to 
algebraic geometry is also discussed. This study culminates in rings with descending 
chain condition for ideals, which determines their ideal structure. 

Chapter 8 introduces another algebraic system, called vector spaces (linear 
spaces) interlinking both internal and external operations. In this chapter, vector 
spaces and their closely related fundamental concepts, such as linear independence, 
basis, dimension, linear transformation and its matrix representation, eigenvalue, 
inner product space, etc., are presented. Such concepts form an integral part of lin- 
ear algebra. Vector spaces have multi-faceted applications. Such spaces over finite 
fields play an important role in computer science, coding theory, design of experiments
and combinatorics. Vector spaces over the infinite field Q of rationals are
important in number theory and design of experiments while vector spaces over C 
are essential for the study of eigenvalues. As the concept of a vector provides a ge- 
ometric motivation, vector spaces facilitate the study of many areas of mathematics 
and integrate the abstract algebraic concepts with the geometric ideas. 

Chapter 9 initiates module theory, which is one of the most important topics in 
modern algebra. It is a generalization of an abelian group (which is a module over 
the ring of integers Z) and also a natural generalization of a vector space (which 
is a module over a division ring or over a field). Many results of vector spaces are
generalized in some special classes of modules, such as free modules and finitely 
generated modules over principal ideal domains. Modules are closely related to the 
representation theory of groups. One of the basic concepts which accelerates the 
study of commutative algebra is module theory, as modules play the central role in 
commutative algebra. Modules are also widely used in structure theory of finitely 
generated abelian groups, finite abelian groups and rings, homological algebra, and 
algebraic topology. In this chapter, we study the basic properties of modules. We 
also consider modules of special classes, such as free modules, modules over prin- 
cipal ideal domains along with structure theorems, exact sequences of modules and 
their homomorphisms, Noetherian and Artinian modules, homology and cohomol- 
ogy modules. Our study culminates in a discussion on the topology of the spectrum 
of modules and rings with special reference to the Zariski topology. 

Chapter 10 discusses some more interesting properties of integers, in particular,
properties of prime numbers and primality testing by using the tools of modern algebra,
which are not studied in Chap. 1. In addition, we study the applications of number
theory, particularly those directed towards theoretical computer science. Number
theory has been used in many ways to devise algorithms for efficient computation and
for computer operations with large integers. Both algebra and number theory play
an increasingly significant role in computing and communication, as evidenced by 
the striking applications of these subjects to the fields of coding theory and cryptog- 
raphy. The motivation of this chapter is to provide an introduction to the algebraic 
aspects of number theory, mainly the study of development of the theory of prime 
numbers with an emphasis on algorithms and applications, necessary for studying 
cryptography to be discussed in Chap. 12. In this chapter, we start with the introduc- 
tion to prime numbers with a brief history. We provide several different proofs of 
the celebrated Theorem of Euclid, stating that there exist infinitely many primes. We
further discuss Fermat numbers, Mersenne numbers, Carmichael numbers, quadratic
reciprocity, and multiplicative functions, such as the Euler φ-function, the number-of-divisors
function, the sum-of-divisors function, etc. This chapter ends with a discussion of primality
testing, both deterministic and probabilistic, such as the Solovay-Strassen and
Miller-Rabin probabilistic primality tests.

Chapter 11 introduces algebraic number theory, which developed through the attempts
of mathematicians to prove Fermat’s last theorem. An algebraic number is
a complex number, which is algebraic over the field Q of rational numbers. An al- 
gebraic number field is a subfield of the field C of complex numbers, which is a 
finite field extension of the field Q, and is obtained from Q by adjoining a finite 
number of algebraic elements. The concepts of algebraic numbers, algebraic inte- 
gers, Gaussian integers, algebraic number fields and quadratic fields are introduced 
in this chapter after a short discussion on general properties of field extension and 
finite fields. There are several proofs of the fundamental theorem of algebra. It is proved
in this chapter by using homotopy (discussed in Chap. 2). Moreover, countability of 
algebraic numbers, existence of transcendental numbers, impossibility of duplica- 
tion of a general cube and that of trisection of a general angle are shown in this 
chapter. 

Chapter 12 presents applications and initiates a study of cryptography. In the 
modern busy digital world, the word “cryptography” is well known to many of us. 
Every day, knowingly or unknowingly, we use different techniques
of cryptography in many places. Logging on to a PC, sending e-mails, withdrawing
money from an ATM by using a PIN code, operating a locker at a bank with
the help of a designated person from the bank, sending a message by using a mobile
phone, buying things through the internet by using a credit card, transferring money
digitally from one account to another over the internet: in all of these we are applying
cryptography. If we observe carefully, we see that in every case we are required to
hide some information in order to transfer information secretly. So, intuitively, we can guess
that cryptography has something to do with security and that cryptography and secrecy
have a close connection. Naturally, the questions that
come to our mind are: What is cryptography? Why is it important to our
daily life? In this chapter, we introduce cryptography, provide a brief overview
of the subject, discuss the basic goals of cryptography and present the subject
both intuitively and mathematically. More precisely, various cryptographic notions,
from the historical ciphers to modern cryptographic notions like public key
encryption schemes, signature schemes, secret sharing schemes, oblivious transfer,
etc., are explained by using mathematical tools mainly based on modern algebra. Finally,
the implementation issues of three public key cryptographic schemes, namely
RSA, ElGamal, and Rabin, by using the open source software SAGE are discussed.

Appendix A studies some interesting properties of semirings. Semiring theory is 
a common generalization of the theory of associative rings and theory of distribu- 
tive lattices. Because of wide applications of results of semiring theory to different 
branches of computer science, it has now become necessary to study structural results
on semirings. This appendix gives some such structural results.

Appendix B discusses category theory initiated by S. Eilenberg and S. Mac Lane
during 1942-1945, to provide a technique for unifying certain concepts. In general,
the pedagogical methods compartmentalize mathematics into its different branches
without emphasizing their interconnections. But category theory provides a convenient
language to tie together several notions and existing results of different areas
of mathematics. This language is conveyed through the concepts of categories, functors,
and natural transformations, which form the basics of category theory and provide
tools to shift problems of one branch of mathematics to another branch to have a
better chance for solution. For example, the Brouwer fixed-point theorem is proved
here with the help of algebraic objects. Moreover, extension problems and classification
of topological spaces are discussed in this appendix.

Appendix C gives a brief historical note highlighting contributions of some 
mathematicians to modern algebra and its closely related topics. The list includes 
a few names only, such as the names of Leonhard Euler (1707-1783), Joseph 
Louis Lagrange (1736-1813), Carl Friedrich Gauss (1777-1855), Augustin Louis 
Cauchy (1789-1857), Niels Henrik Abel (1802-1829), C.G.J. Jacobi (1804-1851), 
H. Grassmann (1809-1877), Évariste Galois (1811-1832), Arthur Cayley (1821-1895),
Leopold Kronecker (1823-1891), Bernhard Riemann (1826-1866), Richard
Dedekind (1831-1916), Peter Ludvig Mejdell Sylow (1832-1918), Camille Jordan
(1838-1922), Sophus Lie (1842-1899), Georg Cantor (1845-1918), C. Felix
Klein (1849-1925), Henri Poincaré (1854-1912), David Hilbert (1862-1943),
Amalie Emmy Noether (1882-1935), Maclagan Wedderburn (1882-1948), Emil
Artin (1898-1962) and Oscar Zariski (1899-1986).

The authors express their sincere thanks to Springer for publishing this book. The 
authors are very thankful to many individuals, our postgraduate and Ph.D. students,
who have helped in proofreading the book. Our thanks are due to the institute
IMBIC, Kolkata, for kind support towards the manuscript development work of this
book. Finally, we acknowledge, with heartfelt thanks, the patience and sacrifice of
our long-suffering family, especially Minati, Sahibopriya, and little Avipriyo.

Chapter 1

Prerequisites: Basics of Set Theory and Integers 


Set theory occupies a very prominent place in modern science. There are two general
approaches to set theory. The first is called “naive set theory” and the second is
called “axiomatic set theory”, also known as “Zermelo-Fraenkel set theory”. These 
two approaches differ in a number of ways. The “naive set theory” was initiated by 
the German mathematician Georg Cantor (1845-1918) around 1870. According to 
Cantor, “a set is any collection of definite, distinguishable objects of our intuition 
or of our intellect to be conceived as a whole”. Each object of a set is called an ele- 
ment or member of the set. The phrase “objects of our intuition or of our intellect” 
offers freedom to choose the objects in forming a set and thus a set is completely 
determined by its objects. He developed the mathematical theory of sets to study the 
real numbers and Fourier series. Richard Dedekind (1831-1916) also enriched set 
theory during 1870s. However, Cantor’s set theory was not immediately accepted 
by his contemporary mathematicians. His definition of a set leads to contradictions 
and logical paradoxes. The best known of them is the Russell paradox, discovered
in 1901 by Bertrand Russell (1872-1970). Such paradoxes led to the axiomatization of
Cantor’s intuitive set theory, giving birth to Zermelo-Fraenkel Set Theory. Some
authors prefer to call it Zermelo-Fraenkel-Skolem Set Theory. The reason is that it 
is the theory of E. Zermelo of 1908 modified by both A. Fraenkel and T. Skolem. 
In 1910 Hilbert wrote “set theory is that mathematical discipline which today occu- 
pies an outstanding role in our science and radiates its powerful influence into all 
branches of mathematics”. In this chapter, we develop a naive set theory, which is 
a non-formalized theory using a natural language to describe sets and their basic 
properties. For precise description of many notions of modern algebra and also for 
mathematical reasoning, the concepts of relations, Zorn’s Lemma, mappings (func- 
tions), cardinality of sets, are very important. They form the basics of set theory and 
are discussed in this chapter. Many concrete examples are based on them. The set 
of integers plays an important role in the development of science, technology, and 
human civilization. Number theory, in a general sense, is the study of the set of integers.
It has long been one of the favorite subjects not only for the students but also for 
many others. In this chapter, some basic concepts and properties of integers, such as 
Peano’s axioms, well-ordering principle, division algorithm, greatest common divisors,
prime numbers, fundamental theorem of arithmetic and modular arithmetic are
studied. Further study of number theory is given in Chap. 10.


1.1 Sets: Introductory Concepts 

The concept of ‘set’ is very important in all branches of mathematics. We come
across certain terms or concepts whose meanings need no explanation. Such terms 
are called undefined terms and are considered as primitive concepts. If one defines 
the term ‘set’ as ‘a set is a well defined collection of objects’, then the meaning of 
collection is not clear. One may define ‘a collection’ as ‘an aggregate’ of objects. 
What is the meaning of ‘aggregate’? As our language is finite, other synonyms,
such as ‘class’, ‘family’ etc., will be exhausted. Mathematicians accept that there are
undefined terms and ‘set’ shall be such an undefined term. But we accept the familiar 
expressions, such as ‘set of all integers’, ‘set of all natural numbers’, ‘set of all 
rational numbers’, ‘set of all real numbers’ etc. 

We shall neither attempt to give a formal definition of a set nor try to lay the 
groundwork for an axiomatic theory of sets. Instead we shall take the operational 
and intuitive approach to define a set. A set is a well defined collection of distin- 
guishable objects. 

The term ‘well defined’ specifies that it can be determined whether or not certain
objects belong to the set in question. In most of our applications we deal with rather
specific objects, and the nebulous notion of a set, in these, emerges as something
quite recognizable. We usually denote sets by capital letters, such as A, B, C, ....
The objects of a set are called the elements or members of the set and are usually
denoted by small letters, such as a, b, c, .... Given a set A we use the notation
‘a ∈ A’ throughout to indicate that an element a is a member of A and this is read
as ‘a is an element of A’ or ‘a belongs to A’; and ‘a ∉ A’ to indicate that the
element a is not a member of A and this is read as ‘a is not an element of A’ or ‘a
does not belong to A’. Since a set is uniquely determined by its elements, we may
describe a set either by a characterizing property of the elements or by listing the
elements. The standard way to describe a set by listing elements is to list elements
of the set separated by commas, in braces. Thus a set A = {a, b, c} indicates that a,
b, c are the only elements of A and nothing else. If B is a set which consists of a,
b, c and possibly more, then notationally, B = {a, b, c, ...}. On the other hand, a set
consisting of a single element x is sometimes called singleton x, denoted by {x}.
By a statement, we mean a sentence about specific objects such that it has a truth
value of either true or false but not both. If a set A is described by a characterizing
property P(x) of its elements x, the brace notation {x : P(x)} or {x | P(x)} is also
often used, and is read as ‘the set of all x such that the statement P(x) about x is
true.’ For example, A = {x : x is an even positive integer < 10}.
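
As an informal illustration (not part of the formal development), the two ways of describing a set can be mimicked with Python's built-in `set` type. The names `A_listed` and `A_built` are our own, and the finite search range 1, ..., 9 is an assumption that happens to suffice for this particular property P(x).

```python
# Listing the elements in braces.
A_listed = {2, 4, 6, 8}

# Set-builder form {x : x is an even positive integer < 10},
# with the candidates x restricted to 1, ..., 9.
A_built = {x for x in range(1, 10) if x % 2 == 0}

# 'a ∈ A' and 'a ∉ A' correspond to the operators `in` and `not in`.
print(4 in A_built)
print(3 not in A_built)

# Both descriptions determine the same set.
print(A_listed == A_built)
```

A set comprehension can only search a finite collection of candidates, so it models the notation {x : P(x)} only when the property P(x) already confines x to a finite range.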

Throughout this book, we use the following notations: 

N⁺ (or Z⁺) is the set of all positive integers (zero is excluded);

N is the set of all non-negative integers (zero is included);

Z is the set of all integers (positive, negative, and zero);

Q is the set of all rational numbers (i.e., numbers which can be expressed as
quotients m/n of integers, where n ≠ 0);

Q⁺ is the set of all positive rational numbers;

R is the set of all real numbers;

R⁺ is the set of all positive real numbers;

C is the set of all complex numbers;

∃ denotes ‘there exists’, ∀ denotes ‘for all’.

An implication is a statement of the form ‘P implies Q’ or ‘if P, then Q’ written
as P ⇒ Q. An implication is false if P is true and Q is false; it is true in all other
cases. A logical equivalence or biconditional is a statement of the form ‘P implies
Q and Q implies P’ and abbreviated to if and only if (iff) and written as P ⇔ Q.
Thus P ⇔ Q is true exactly when P and Q are both true or both false.
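
The truth values described above can be tabulated mechanically. The following is a minimal sketch; the helper names `implies` and `iff` are our own choices, and P ⇒ Q is encoded as ‘(not P) or Q’, which is false only when P is true and Q is false.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # P => Q is false only when P is true and Q is false.
    return (not p) or q

def iff(p: bool, q: bool) -> bool:
    # P <=> Q means 'P implies Q and Q implies P'.
    return implies(p, q) and implies(q, p)

# Print the full truth table: P, Q, P => Q, P <=> Q.
for p, q in product([True, False], repeat=2):
    print(p, q, implies(p, q), iff(p, q))
```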

Two elements a and b of a set S are said to be equal, denoted by a = b, iff
they are the same element; otherwise we write a ≠ b. Two sets X and Y are
said to be equal, denoted by X = Y iff they have the same elements. For example, 
{2, 4, 6, 8} = {2x : x = 1, 2, 3, 4}. 

We introduce a special set which we call the ‘empty (or vacuous or null) set’,
denoted by ∅, which we think of as ‘the set having no elements’. The empty set is
only a convention. It is important in the context of logical completeness. As ∅ is
thought of as ‘the set with no elements’ the convention is that for any element x, the
relation x ∈ ∅ does not hold. Thus {x : x is an even integer such that x² = 2} = ∅. If
a set A is such that A ≠ ∅, then A is called a non-empty (or non-vacuous or non-null)
set and its null subset ∅_A is {x ∈ A : x ≠ x}. If X and Y are two sets such that
every element of X is also an element of Y, then X is called a subset of Y, denoted
by X ⊆ Y (or simply by X ⊂ Y), which reads X is contained in Y (or equivalently,
Y ⊇ X or simply Y ⊃ X, which reads Y contains X). For example, N ⊆ Z and
Z ⊆ Q. Clearly, X ⊆ X for every set X.

Proposition 1.1.1 (i) X = Y if and only if X ⊆ Y and Y ⊆ X;

(ii) All null subsets are equal.

Proof (i) It follows from the definition of equality of sets.

(ii) Let A and B be any two sets and ∅_A, ∅_B be the respective null subsets. If
∅_A ⊆ ∅_B is false, then there exists at least one element in ∅_A which is not in ∅_B;
but this is impossible. Consequently, ∅_A ⊆ ∅_B. Similarly, ∅_B ⊆ ∅_A. Hence by (i)
∅_A = ∅_B and thus all null subsets are equal. □
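
The following small sketch mirrors Proposition 1.1.1 on finite sets. The helper names `is_subset`, `sets_equal` and `null_subset` are hypothetical, chosen only for this illustration. The null subset is built literally as {x ∈ A : x ≠ x}, which is empty for the integers and strings used here.

```python
def is_subset(X, Y):
    # X ⊆ Y: every element of X is also an element of Y.
    return all(x in Y for x in X)

def sets_equal(X, Y):
    # Proposition 1.1.1(i): X = Y iff X ⊆ Y and Y ⊆ X.
    return is_subset(X, Y) and is_subset(Y, X)

def null_subset(A):
    # The null subset of A, written {x ∈ A : x ≠ x}.
    return {x for x in A if x != x}

X = {2, 4, 6, 8}
Y = {2 * x for x in (1, 2, 3, 4)}
print(sets_equal(X, Y))

# Proposition 1.1.1(ii): null subsets of different sets coincide.
print(sets_equal(null_subset({1, 2, 3}), null_subset({"a", "b"})))
```

Note that `is_subset(set(), Y)` holds for every `Y`, because the condition is checked for no elements at all. This matches the remark that the null set is contained in every set.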

Remark There is one and only one null set ∅, and it is contained in every set.

If a set X is a subset of a set Y and there exists at least one element of Y which
is not an element of X, then X is called a proper subset of Y, denoted by X ⊊ Y. If
a set X is not a subset of Y, we write X ⊈ Y.

Proposition 1.1.2 A set X of n elements has 2ⁿ subsets.

Proof The number of the subsets having r (≤ n) elements out of n elements of X is
the number of combinations of n elements taken r at a time, i.e.,

\[ {}^{n}C_{r} = \frac{n!}{r!\,(n-r)!}. \]

Hence the number of all subsets including X and ∅ is

\[ \sum_{r=0}^{n} {}^{n}C_{r} = {}^{n}C_{0} + {}^{n}C_{1} + \cdots + {}^{n}C_{n} = (1+1)^{n} = 2^{n}. \]

□
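
The counting argument in this proof can be checked on a small example. The sketch below lists the subsets of a finite set by size r, exactly as in the proof. The function name `all_subsets` and the sample set are our own assumptions.

```python
from itertools import combinations
from math import comb

def all_subsets(X):
    """Return every subset of X, grouped by size r = 0, 1, ..., n."""
    items = list(X)
    return [set(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

X = {"a", "b", "c"}
n = len(X)
subsets = all_subsets(X)

# Number of subsets found by enumeration, and the value 2^n.
print(len(subsets), 2 ** n)

# The binomial sum nC0 + nC1 + ... + nCn used in the proof.
print(sum(comb(n, r) for r in range(n + 1)))
```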

To avoid extraneous elements, we assume that all elements under consideration
belong to some fixed but arbitrary set, called the universal set , denoted by U. 

Given two sets, we can combine them to form new sets. We can carry out the 
same procedure for any number of sets, finite or infinite. We do so for two sets first, 
because it leads to the general construction. 

Given two sets A and B, we can form a set that consists of exactly all the elements
of A together with all the elements of B. This is called the union of A and B.

Definition 1.1.1 The union (or join) of two sets A and B, written as A ∪ B, is the
set A ∪ B = {x : x ∈ A or x ∈ B} (‘or’ has been used in the inclusive sense).

Remark In set theory, we say that x ∈ A or x ∈ B. This means x belongs to at
least one of A or B and may belong to both A and B. Clearly, A ∪ A = A; if B
is a subset of A, then A ∪ B = A; if A = {1, 2, 3}, B = {3, 7, 8, 9}, then A ∪ B =
{1, 2, 3, 7, 8, 9}; A ∪ ∅ = A for every set A.

Given two sets A and B, we can form a set that consists of exactly all elements
common to A and B. This is called the intersection of A and B.

Definition 1.1.2 The meet (or intersection) of two sets A and B, written as A ∩ B,
is the set A ∩ B = {x : x ∈ A and x ∈ B}.

Clearly, A ∩ A = A; if B is a subset of A, then A ∩ B = B; A ∩ ∅ = ∅; if A =
{1, 2, 3}, B = {3, 7, 8, 9}, then A ∩ B = {3}.
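
The examples following Definitions 1.1.1 and 1.1.2 can be reproduced with Python's built-in set operators, where `|` plays the role of ∪ and `&` the role of ∩. This is only an illustrative sketch on the finite sets used in the text.

```python
A = {1, 2, 3}
B = {3, 7, 8, 9}
empty = set()

union = A | B          # A ∪ B = {x : x ∈ A or x ∈ B}
intersection = A & B   # A ∩ B = {x : x ∈ A and x ∈ B}

print(sorted(union))
print(sorted(intersection))

# The identities stated in the text, checked on this example.
print(A | A == A, A | empty == A)
print(A & A == A, A & empty == empty)
```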


Theorem 1.1.1 Each of the operations ∪ and ∩ is

(a) idempotent: A ∪ A = A = A ∩ A, for every set A;

(b) associative: A ∪ (B ∪ C) = (A ∪ B) ∪ C and A ∩ (B ∩ C) = (A ∩ B) ∩ C for
any three sets A, B, C;

(c) commutative: A ∪ B = B ∪ A and A ∩ B = B ∩ A for any two sets A, B;

(d) ∩ distributes over ∪ and ∪ distributes over ∩:

(i) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);

(ii) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) for any three sets A, B, C.





Proof Verification of (a)-(c) is trivial. We now give a set theoretic proof of (d)(i).
The proof is divided into two parts:

A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C)

and

(A ∩ B) ∪ (A ∩ C) ⊆ A ∩ (B ∪ C).

Now, x ∈ A ∩ (B ∪ C) ⇒ x ∈ A and x ∈ B ∪ C ⇒ (x ∈ A and x ∈ B) or (x ∈ A
and x ∈ C) ⇒ x ∈ A ∩ B or x ∈ A ∩ C ⇒ x ∈ (A ∩ B) ∪ (A ∩ C) ⇒ A ∩ (B ∪ C) ⊆
(A ∩ B) ∪ (A ∩ C), as x is an arbitrary element of A ∩ (B ∪ C). We now prove the
reverse inclusion.

Since B ⊆ B ∪ C, A ∩ B ⊆ A ∩ (B ∪ C).

Similarly, A ∩ C ⊆ A ∩ (B ∪ C).

Consequently, (A ∩ B) ∪ (A ∩ C) ⊆ A ∩ (B ∪ C).

Hence (d)(i) follows by Proposition 1.1.1(i). □
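
As a sanity check only (a finite verification, not a substitute for the proof), the laws of Theorem 1.1.1 can be tested on every triple of subsets of a small universal set. The universe {1, 2, 3} and the helper names `powerset` and `laws_hold` are assumptions made for this sketch.

```python
from itertools import combinations, product

def powerset(universe):
    # All subsets of the universe, as hashable frozensets.
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def laws_hold(A, B, C):
    return (
        A | A == A == A & A                      # (a) idempotent
        and A | (B | C) == (A | B) | C           # (b) associative
        and A & (B & C) == (A & B) & C
        and A | B == B | A                       # (c) commutative
        and A & B == B & A
        and A & (B | C) == (A & B) | (A & C)     # (d)(i)
        and A | (B & C) == (A | B) & (A | C)     # (d)(ii)
    )

subsets_of_U = powerset({1, 2, 3})
all_ok = all(laws_hold(A, B, C)
             for A, B, C in product(subsets_of_U, repeat=3))
print(all_ok)
```

The check runs over 8 × 8 × 8 = 512 triples. A larger universe would make it more thorough but still finite, which is why the element-wise argument in the proof is needed.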

Definition 1.1.3 Two non-null sets A and B are said to be disjoint iff A Pi B = 0. 

For example, if A is the set of all negative integers, then A n N + = 0. 

Definition 1.1.4 The (relative) complement (or difference) of a set A with respect 
to a set B, denoted by B — A (or B \ A) is the set of exactly all elements which 
belong to B but not to A, i.e., B — A = {xeB:x£ A}. 

If U is the universal set, the complement of A in U (i.e., with respect to U) is 
denoted by A c or A'. This implies (A')' = A, If = 0, 0' = U, A U A' = U, A H A' = 
0, B - A = B H A', B - A / A - B for A ^ B. 

Definition 1.1.5 The symmetric difference of two given sets A and B, denoted by
A Δ B, is defined by A Δ B = (A − B) ∪ (B − A).

This implies clearly, A Δ B = B Δ A, A Δ (B Δ C) = (A Δ B) Δ C, A Δ ∅ = A,
A Δ A = ∅, A Δ B = (A ∪ B) − (A ∩ B) and A Δ C = B Δ C ⇒ A = B.
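
The listed properties of the symmetric difference can be tried out on concrete sets. The sketch below is an illustration under stated assumptions: the sets A, B, C and the helper name sym_diff are chosen for this example only.

```python
# Illustrative sets; any finite sets would do.
A = {1, 2, 3}
B = {3, 7, 8, 9}
C = {2, 3, 9}
EMPTY = set()


def sym_diff(x, y):
    """Symmetric difference as in Definition 1.1.5: (x - y) ∪ (y - x)."""
    return (x - y) | (y - x)


checks = {
    "commutative": sym_diff(A, B) == sym_diff(B, A),
    "associative": sym_diff(A, sym_diff(B, C)) == sym_diff(sym_diff(A, B), C),
    "empty set is neutral": sym_diff(A, EMPTY) == A,
    "self-inverse": sym_diff(A, A) == EMPTY,
    "union minus intersection": sym_diff(A, B) == (A | B) - (A & B),
}
print(checks)
```

Python's built-in operator `A ^ B` computes the same set, so `sym_diff` is written out only to mirror the definition.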

It is extremely useful to imagine geometric figures in terms of which we can 
visualize sets and operations on them. A convenient way to do so is to represent the 
universal set U by a rectangular area in a plane and the elements of U by the points 
of the area. Sets can be pictured by circles within this rectangle and diagrams can be 
drawn illustrating operations on sets and relations between them. Such diagrams are 
known as Venn diagrams , named after John Venn (1834-1923), a British logician. 

Our three set operations are represented by shaded regions in Fig. 1.1. 

Fig. 1.1 Venn diagrams representing three set operations: union (A ∪ B),
intersection (A ∩ B), and set difference (A − B)

The cartesian product is one of the most important constructions of set theory. It
enables us to express many concepts in terms of sets. The concept of cartesian
product owes its origin to the coordinate plane of analytic geometry. It welds
together the sets of a given family into a single new set. With two objects a, b there
corresponds a new object (a, b), called their ordered pair. Ordered pairs are subject
to one condition: (a, b) = (c, d) iff a = c and b = d. In particular, (a, b) = (b, a) iff
a = b; a is called the first coordinate and b is called the second coordinate of the
pair (a, b).

Definition 1.1.6 Let A and B be two non-null sets (distinct or not). Their cartesian
product A × B is the set defined by A × B = {(a, b) : a ∈ A, b ∈ B}.

Example 1.1.1 (i) If A = {1, 2}, B = {2, 3, 4}, then A × B = {(1, 2), (1, 3), (1, 4),
(2, 2), (2, 3), (2, 4)}.

(ii) If A = B = R^1 (Euclidean line), then A × B, being the set of all ordered pairs
of reals, represents the points in the Euclidean plane.

Proposition 1.1.3 A × B ≠ ∅ if and only if A ≠ ∅ and B ≠ ∅.

Proof If A × B ≠ ∅, then there exists some (a, b) ∈ A × B such that a ∈ A, b ∈ B.
This shows that A ≠ ∅, B ≠ ∅. Conversely, if A ≠ ∅ and B ≠ ∅, there exist some
a ∈ A and b ∈ B. As the pair (a, b) ∈ A × B, A × B ≠ ∅. □

Proposition 1.1.4 If C × D ≠ ∅, then C × D ⊆ A × B iff C ⊆ A and D ⊆ B.

Proof Left as an exercise. □ 

Proposition 1.1.5 For non-empty sets A and B, A × B = B × A if and only if
A = B (the operation A × B is therefore not commutative).

Proof Left as an exercise. □

If a set A has n elements, then the set A × A has n^2 elements.

The set of elements {(a, a) : a ∈ A} in A × A is called the diagonal of A × A.

Theorem 1.1.2 ‘×’ distributes over ∪, ∩ and −:

(i) A × (B ∪ C) = (A × B) ∪ (A × C);

(ii) A × (B ∩ C) = (A × B) ∩ (A × C);

(iii) A × (B − C) = (A × B) − (A × C).

Proof Left as an exercise. □ 
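
The statements about cartesian products can be explored on small finite sets before attempting the exercises. The following sketch uses itertools.product from the Python standard library; the helper name cartesian and the third set C are assumptions introduced for illustration.

```python
from itertools import product


def cartesian(a, b):
    """Cartesian product a × b as a set of ordered pairs (Definition 1.1.6)."""
    return set(product(a, b))


A = {1, 2}
B = {2, 3, 4}
C = {3, 4, 5}  # hypothetical third set, not taken from the text

AxB = cartesian(A, B)
diagonal = {(a, a) for a in A}

print(sorted(AxB))
print(cartesian(A, B | C) == cartesian(A, B) | cartesian(A, C))  # Theorem 1.1.2(i)
print(cartesian(A, B & C) == cartesian(A, B) & cartesian(A, C))  # Theorem 1.1.2(ii)
print(cartesian(A, B - C) == cartesian(A, B) - cartesian(A, C))  # Theorem 1.1.2(iii)
print(cartesian(A, B) == cartesian(B, A))                        # Proposition 1.1.5
```

With these sets, A × B reproduces the six pairs of Example 1.1.1(i), and the last comparison is false because A ≠ B.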

Remark Definition 1.1.6 is extended to the product of n sets for any positive integer
n > 2. If A_1, A_2, ..., A_n are non-empty sets, then their product A_1 × A_2 × ··· × A_n
is the set of all ordered n-tuples (a_1, a_2, ..., a_n), where a_i is in A_i for each i. If in
particular, A_1 = A_2 = ··· = A_n = A, then their product is denoted by the symbol
A^n. These ideas yield well known sets R^n and C^n.

The operation ‘×’ is not associative: (A × B) × C ≠ A × (B × C) in general [see
Ex. 11 of SE-I].

Family of Sets Let I ≠ ∅ be a set. If for each i ∈ I, there exists a set A_i, then the
collection of sets {A_i : i ∈ I} is called a family of sets, and I is called an indexing
set for the family. For some i ≠ j, A_i may be equal to A_j.

Any non-empty collection of sets can be converted to a family of sets by ‘self
indexing’. We use the collection itself as an indexing set and assign to each member
of the collection the set it represents.

We extend the notions of unions and intersections to families of sets.

Definition 1.1.7 Let A be a given set, and {A_i : i ∈ I} be a family of subsets of A.

The union ⋃_i A_i (or ⋃_{i∈I} A_i or ⋃{A_i : i ∈ I}) of the family is the set
⋃_i A_i = {x ∈ A : x ∈ A_i for some i ∈ I} and the intersection ⋂_i A_i (or
⋂_{i∈I} A_i or ⋂{A_i : i ∈ I}) of the family is the set ⋂_i A_i = {x ∈ A : x ∈
A_i for each i ∈ I}.

Clearly, for any set A, A = ⋃{{x} : x ∈ A} = ⋃_{x∈A} {x}. In case of infinite
intersections and unions, some non-intuitive situations may occur:

Example 1.1.2 (i) Let A_p = {n ∈ Z : n > p}, p = 1, 2, 3, .... Then A_1 ⊃ A_2 ⊃
A_3 ⊃ ··· and each A_p is a non-empty set such that each A_p contains its successor
A_{p+1}. Clearly, ⋂_p A_p = ∅.

(ii) Let A_n = [0, 1 − 2^{-n}] and B_n = [0, 1 − 3^{-n}] for each positive integer n. Then
each A_n is a proper subset of B_n and A_n ≠ B_r for each n and r. Clearly,

⋃_n A_n = ⋃_n B_n = [0, 1).

Definition 1.1.8 Let A be a non-null set. Its power set 𝒫(A) is the set of all subsets
of A.

Clearly, ∅ ∈ 𝒫(A) and A ∈ 𝒫(A) for any set A; B ⊆ A ⇔ B ∈ 𝒫(A); a ∈ A ⇔
{a} ∈ 𝒫(A).

Proposition 1.1.6 ⋂_i 𝒫(A_i) = 𝒫(⋂_i A_i).

Proof Left as an exercise. □

Remark ⋃_i 𝒫(A_i) ≠ 𝒫(⋃_i A_i) in general. For example, let A_1 = {1}, A_2 = {2}. Then
⋃_i 𝒫(A_i) has three elements, viz. ∅, {1} and {2}, whereas 𝒫(⋃_i A_i) has four
elements viz. ∅, {1}, {2} and {1, 2}.
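
The contrast between Proposition 1.1.6 and the remark above can be seen on a small example. The sketch below builds power sets with itertools.combinations; the helper name power_set and the sets X, Y are assumptions made for this illustration.

```python
from itertools import combinations


def power_set(s):
    """Return the set of all subsets of s, each stored as a frozenset."""
    items = list(s)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


A1 = {1}
A2 = {2}

union_of_power_sets = power_set(A1) | power_set(A2)
power_set_of_union = power_set(A1 | A2)
print(len(union_of_power_sets), len(power_set_of_union))

# Proposition 1.1.6 on two hypothetical sets X and Y.
X = {1, 2}
Y = {2, 3}
print(power_set(X) & power_set(Y) == power_set(X & Y))
```

The two-element subset {1, 2} is the member of the power set of the union that belongs to neither individual power set.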
1.2 Relations on Sets

In mathematics, two very important types of relations, namely equivalence relations
and order relations, arise frequently. Sometimes we need to study decompositions
of a non-empty set X into disjoint subsets whose union is the entire set X (i.e., X
is filled up by these subsets). Equivalence relations on X provide tools to generate
such decompositions of X and produce new sets bearing a natural connection with
the original set X.

A binary relation R on a non-empty set A is a mathematical concept and intuitively
is a proposition such that for each ordered pair (a, b) of elements of A, we
can determine whether aRb (read as a is in relation R to b) is true or false. We
define it formally in terms of the set concept.

Definition 1.2.1 A binary relation R on a non-empty set A is a subset R ⊆ A × A
and a binary relation S from A to B is a subset S of A × B. The pair (a, b) ∈ R is
also denoted as aRb.

A binary relation S from A to B is sometimes written as S : A → B. Instead of
writing a binary relation on A, we write only a relation on A, unless there is any
confusion.

Example 1.2.1 For any set A, the diagonal Δ = {(a, a) : a ∈ A} ⊆ A × A is the
relation of equality.

In a binary relation R on A, each pair of elements of A need not be related i.e.,
(a, b) may not belong to R for all pairs (a, b) ∈ A × A.

For example, if a ≠ b, then (a, b) ∉ Δ and also (b, a) ∉ Δ.

Example 1.2.2 The relation of inclusion on 𝒫(A) is {(P, Q) ∈ 𝒫(A) × 𝒫(A) : P ⊆
Q} ⊆ 𝒫(A) × 𝒫(A).


1.2.1 Equivalence Relation 

A fundamental mathematical construction is to start with a non-empty set X and to 
decompose the set into a family of disjoint subsets of X whose union is the whole 
set X , called a partition of X and to form a new set by equating each such subset 
to an element of a new set, called a quotient set of X given by the partition. For 
this purpose we introduce the concept of an equivalence relation which is logically 
equivalent to a partition. 

Definition 1.2.2 A binary relation R on A is said to be an equivalence relation on
A iff

(a) R is reflexive: (a, a) ∈ R for all a ∈ A;

(b) R is symmetric: if (a, b) ∈ R, then (b, a) ∈ R for a, b ∈ A;

(c) R is transitive: if (a, b) ∈ R and (b, c) ∈ R, then (a, c) ∈ R for a, b, c ∈ A.

Instead of speaking about subsets of A × A, we can also define an equivalence
relation as below by writing aRb in place of (a, b) ∈ R.

Definition 1.2.3 A binary relation R on A is said to be an equivalence relation
iff R is

(a’) reflexive: aRa for all a ∈ A;

(b’) symmetric: aRb implies bRa for a, b ∈ A;

(c’) transitive: aRb and bRc imply aRc for a, b, c ∈ A.

Example 1.2.3 Define R on Z by aRb ⇔ a − b is divisible by a fixed integer n > 1.
Then R is an equivalence relation.

Proof Since a − a = 0 is divisible by n for all a ∈ Z, aRa for all a ∈ Z, hence R
is reflexive. If a − b is divisible by n, then b − a is also divisible by n; hence R
is symmetric. Finally, if a − b and b − c are both divisible by n, then their sum
a − c is also divisible by n; hence R is transitive. Consequently, R is an equivalence
relation. (See Example 1.2.9.) □

Example 1.2.4 Let T be the set of all triangles in the Euclidean plane. Define
α R β ⇔ α and β are similar for α, β ∈ T. Then R is an equivalence relation.

Definition 1.2.4 A relation R on a set A is said to be antisymmetric iff xRy and
yRx imply x = y for x, y ∈ A.

Example 1.2.5 Consider the inclusion relation ‘⊆’ on 𝒫(A) of a given set A. Then
P ⊆ Q and Q ⊆ P imply P = Q for two subsets P, Q ∈ 𝒫(A); hence ‘⊆’ is
antisymmetric.

Remark The properties of being reflexive, symmetric (or antisymmetric) and
transitive are independent of each other.

Example 1.2.6 A relation which is reflexive, symmetric but not transitive: Let R
be the binary relation on Z given by (x, y) ∈ R iff x − y = 0, 5 or −5. Then R is
reflexive, since x − x = 0 ⇒ (x, x) ∈ R, for all x ∈ Z. R is symmetric, since (x, y) ∈
R implies x − y = 0, 5 or −5 and hence y − x = 0, −5 or 5, so that (y, x) ∈ R. But
R is not transitive, since (12, 7) ∈ R and (7, 2) ∈ R but (12, 2) ∉ R.

Example 1.2.7 A relation which is reflexive and transitive but neither symmetric
nor antisymmetric: Let Z* be the set of all non-zero integers and R be the relation
on Z* given by (a, b) ∈ R iff a is a factor of b, i.e., iff a | b. Since a | a for all a ∈ Z*,
and a | b and b | c ⇒ a | c, R is reflexive and transitive. 2 | 8 but 8 | 2 is not true; hence R
is not symmetric. Again 7 | −7 and −7 | 7 but 7 ≠ −7; hence R is not antisymmetric.

Example 1.2.8 A relation which is symmetric and transitive but not reflexive: Let
R be the set of all real numbers and ρ be the relation on R given by (a, b) ∈ ρ iff
ab > 0. Clearly, ρ is symmetric and transitive. Since (0, 0) ∉ ρ, ρ is not reflexive.
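
The independence of these properties can be verified on finite restrictions of Examples 1.2.6 to 1.2.8. The following sketch stores a relation as a Python set of ordered pairs; the finite windows of integers and the helper names are assumptions made for illustration, since the relations in the examples are defined on infinite sets.

```python
def is_reflexive(rel, domain):
    return all((a, a) in rel for a in domain)


def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)


def is_antisymmetric(rel):
    return all(a == b for (a, b) in rel if (b, a) in rel)


def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


# Example 1.2.6 restricted to the integers 0..12.
WINDOW = range(0, 13)
ex_126 = {(x, y) for x in WINDOW for y in WINDOW if x - y in (0, 5, -5)}

# Example 1.2.7 restricted to the non-zero integers -8..8 (a | b iff b % a == 0).
NONZERO = [a for a in range(-8, 9) if a != 0]
ex_127 = {(a, b) for a in NONZERO for b in NONZERO if b % a == 0}

# Example 1.2.8 restricted to a small sample of real numbers.
SAMPLE = [-2, -1, 0, 1, 2]
ex_128 = {(a, b) for a in SAMPLE for b in SAMPLE if a * b > 0}

print(is_reflexive(ex_126, WINDOW), is_symmetric(ex_126), is_transitive(ex_126))
print(is_reflexive(ex_127, NONZERO), is_transitive(ex_127),
      is_symmetric(ex_127), is_antisymmetric(ex_127))
print(is_symmetric(ex_128), is_transitive(ex_128), is_reflexive(ex_128, SAMPLE))
```

A finite window can only exhibit a failure or confirm a property within the window; the general statements still rest on the arguments given in the examples.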
Definition 1.2.5 Let X be a non-null set. Then a family 𝒫 = {A_i : i ∈ I} of subsets
of X is said to be a partition of X iff

(i) Each A_i is non-empty, i ∈ I;

(ii) A_i ∩ A_j = ∅ for all i ≠ j, i, j ∈ I;

(iii) ⋃{A_i : A_i ∈ 𝒫} = X.

Thus a partition of X is a disjoint class of non-empty subsets of X whose union 
is the set X itself and a partition of X is the result of splitting the set X into non- 
empty subsets in such a way that each x e X belongs to one and only one of the 
given subsets. 

Partition of a set is not unique. For example, if X = {1, 2, 3, 4, 5, 6 }, then 
{1, 3, 5}, {2, 4, 6 } and {1, 2, 3, 4}, {5, 6 } are two different partitions of X . 

We shall show in Theorem 1.2.9 that there is a bijective correspondence between
the set of equivalence relations on a set X ≠ ∅ and the set of all partitions of X.

Theorem 1.2.1 Let ρ be an equivalence relation on a set X ≠ ∅ and let for
each x ∈ X, the class of x, denoted by (x) (or [x]), be defined by (x) = {y ∈ X :
(x, y) ∈ ρ}. Then

(i) For each x ∈ X, x ∈ (x);

(ii) If x, y ∈ X, either (x) = (y) or (x) ∩ (y) = ∅;

(iii) ⋃{(x) : x ∈ X} = X;

(iv) If 𝒫_ρ = {(x) : x ∈ X}, then 𝒫_ρ is a partition of X, called the partition induced
by ρ and denoted by X/ρ.

Proof (i) Since ρ is reflexive, (x, x) ∈ ρ for each x ∈ X. Hence by definition of (x),
x ∈ (x).

(ii) Let (x) ∩ (y) ≠ ∅. Then there exists some z ∈ (x) ∩ (y). Consequently,
(x, z) ∈ ρ and (y, z) ∈ ρ by definition of classes. Let w be an arbitrary element
such that w ∈ (x). Then (x, w) ∈ ρ. Now (x, z) ∈ ρ ⇒ (z, x) ∈ ρ as ρ is symmetric.
So, (z, x) ∈ ρ and (x, w) ∈ ρ ⇒ (z, w) ∈ ρ as ρ is transitive, and hence

(y, z) ∈ ρ and (z, w) ∈ ρ ⇒ (y, w) ∈ ρ ⇒ w ∈ (y) ⇒ (x) ⊆ (y)    (1.1)

as w is an arbitrary element of (x).

Interchanging the role of x and y by symmetric property of ρ, we have

(y) ⊆ (x).    (1.2)

Hence (1.1) and (1.2) show that (x) = (y).

(iii) Since for each x ∈ X, x ∈ (x), it follows that

X ⊆ ⋃{(x) : x ∈ X}.    (1.3)

Again as (x) ⊆ X for each x ∈ X, it follows that

⋃{(x) : x ∈ X} ⊆ X.    (1.4)

Hence (1.3) and (1.4) show that ⋃{(x) : x ∈ X} = X.

(iv) By using (i), (ii) and (iii), it follows by definition of a partition that 𝒫_ρ is a
partition of X. □

The class (x) associated with ρ is sometimes denoted by (x)_ρ to avoid any
confusion. The set (x) is called the equivalence class (with respect to ρ) determined
by x.

The converse of Theorem 1.2.1 is also true.

Theorem 1.2.2 Let 𝒫 be a given partition of a non-null set X. Define a relation
ρ = ρ_𝒫 (depending on 𝒫) on X by (x, y) ∈ ρ_𝒫 iff there exists A ∈ 𝒫 such that
x, y ∈ A (i.e., iff y belongs to the same class as x). Then ρ_𝒫 is an equivalence
relation, induced by 𝒫. Moreover,

(a) 𝒫(ρ_𝒫) = 𝒫 (partition induced by ρ_𝒫);

(b) ρ(𝒫_ρ) = ρ (equivalence relation induced by partition 𝒫_ρ).

Proof Let x ∈ X. Since 𝒫 is a partition, there exists A ∈ 𝒫 such that x ∈ A. So,
(x, x) ∈ ρ. This implies ρ is reflexive. If (x, y) ∈ ρ, then (y, x) ∈ ρ from the
definition of ρ. So, ρ is symmetric. Let (x, y) ∈ ρ and (y, z) ∈ ρ. Then there exist A,
B ∈ 𝒫 such that x, y ∈ A and y, z ∈ B. Consequently, y ∈ A ∩ B. Since 𝒫 is a
partition, A = B. Consequently, x, z ∈ A ∈ 𝒫; therefore, (x, z) ∈ ρ. So, ρ is transitive.
As a result ρ is an equivalence relation.

(a) We now show that 𝒫(ρ_𝒫) = 𝒫. Let A ∈ 𝒫 and x ∈ A. Then for every y ∈
A, (x, y) ∈ ρ_𝒫. Consequently, y ∈ (x)_{ρ_𝒫} ⇒ A ⊆ (x)_{ρ_𝒫}. Next let z ∈ (x)_{ρ_𝒫}. Then
there exists some B ∈ 𝒫 such that x, z ∈ B. But x ∈ A ⇒ x ∈ A ∩ B ⇒ A = B (by
the property of a partition). Consequently, z ∈ A ⇒ (x)_{ρ_𝒫} ⊆ A. As a result,

(x)_{ρ_𝒫} = A, but (x)_{ρ_𝒫} ∈ 𝒫(ρ_𝒫).

Consequently, 𝒫 ⊆ 𝒫(ρ_𝒫). Moreover both 𝒫 and 𝒫(ρ_𝒫) are partitions of the same
set X. Clearly, 𝒫 = 𝒫(ρ_𝒫).

(b) To prove ρ(𝒫_ρ) = ρ, let (x, y) ∈ ρ. Then y ∈ (x)_ρ ∈ 𝒫_ρ ⇒ (x, y) ∈ ρ(𝒫_ρ) ⇒
ρ ⊆ ρ(𝒫_ρ). Again (x, y) ∈ ρ(𝒫_ρ) ⇒ there is an equivalence class (z)_ρ such that
x, y ∈ (z)_ρ ⇒ (z, x) ∈ ρ and (z, y) ∈ ρ ⇒ (x, z) ∈ ρ and (z, y) ∈ ρ ⇒ (x, y) ∈ ρ
(by transitive property of ρ) ⇒ ρ(𝒫_ρ) ⊆ ρ. As a result ρ = ρ(𝒫_ρ). □
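
The passage from a partition to an equivalence relation and back, as in Theorems 1.2.1 and 1.2.2, can be traced on a finite set. In the sketch below the function names classes_of, relation_of and is_partition are assumptions introduced for illustration, and the set X with its partition is the one used earlier in this section.

```python
def is_partition(blocks, domain):
    """Check the three conditions of Definition 1.2.5 for a finite family."""
    blocks = list(blocks)
    nonempty = all(len(block) > 0 for block in blocks)
    disjoint = all(
        not (blocks[i] & blocks[j])
        for i in range(len(blocks))
        for j in range(i + 1, len(blocks))
    )
    covers = set().union(*blocks) == set(domain)
    return nonempty and disjoint and covers


def classes_of(rel, domain):
    """Partition induced by an equivalence relation: the set of classes (x)."""
    return {frozenset(y for y in domain if (x, y) in rel) for x in domain}


def relation_of(partition):
    """Equivalence relation induced by a partition: x, y lie in a common block."""
    return {(x, y) for block in partition for x in block for y in block}


X = {1, 2, 3, 4, 5, 6}
partition = {frozenset({1, 3, 5}), frozenset({2, 4, 6})}

rho = relation_of(partition)
print(is_partition(partition, X))
print(classes_of(rho, X) == partition)        # part (a) of Theorem 1.2.2
print(relation_of(classes_of(rho, X)) == rho)  # part (b) of Theorem 1.2.2
```

Blocks are stored as frozensets so that a partition can itself be a Python set of blocks.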

The disjoint classes (x) into which a set A is partitioned by an equivalence
relation ρ constitute a set, called the quotient set of A by ρ, denoted by A/ρ, where
(x) denotes the class containing the element x ∈ A. Each element x of the class (x)
is called a representative of (x) and a set formed by taking one element from each
class is called a representative system of the partition.

The following example shows that the quotient set of an infinite set may be finite. 

Example 1.2.9 (i) The set of residue classes Z_n: In Example 1.2.3, we have defined
an equivalence relation R on Z; when an integer a (positive, negative or 0) is divided
by a positive integer n, there are n possible remainders, viz., 0, 1, ..., n − 1. If r is
the remainder for a, then a − r is divisible by n, so that a ∈ (r) = {y ∈ Z : y − r
is divisible by n}. Hence every integer belongs to one and only one of the n classes
(0), (1), ..., (n − 1). Thus the quotient set Z/R consists of the n distinct classes
(0), (1), ..., (n − 1); it is called the set of residue classes modulo n and is denoted by Z_n.
The integers 0, 1, ..., n − 1 form a representative system of the partition.

The equivalence relation R on Z defined in this example is called the congruence
modulo n. Two integers a and b in the same residue class are said to be congruent,
denoted by a ≡ b (mod n). The set Z_n carries several rich algebraic structures
which we shall study in the subsequent chapters.

(ii) Visual Description of Z_12: We can use a clock to describe Z_12 visually. Take
the real number line R^1 and wrap it around a circumference bearing the numbers 1
to 12 located as usual on a clock. Since R^1 is infinitely long, it will wrap infinitely
many times around the circle, and so each ‘hour point’ on the clock will coincide
with infinitely many integers. The integers located at the hour r for r = 1, 2, ..., 12
are all integers x such that x − r is divisible by 12 i.e., are all integers congruent to
r modulo 12 (see also clock arithmetic in Sect. 2.7.3 of Chap. 2).
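
The residue classes modulo n can be listed for a finite window of integers. The sketch below is an illustration only: the window from −24 to 24 and the helper name residue_class are assumptions, since each class is in fact infinite.

```python
def residue_class(r, n, window):
    """Members of the class (r) modulo n that fall inside `window`."""
    return [x for x in window if (x - r) % n == 0]


N = 12
WINDOW = range(-24, 25)

# Every integer of the window lies in exactly one of the classes (0), ..., (N - 1).
membership_counts = [
    sum(x in residue_class(r, N, WINDOW) for r in range(N)) for x in WINDOW
]
print(set(membership_counts))

# On the clock face, the hour 12 carries the same integers as the class (0).
print(residue_class(12, N, WINDOW))
print(residue_class(12, N, WINDOW) == residue_class(0, N, WINDOW))
```

In Python, `x % n` is non-negative for positive n, so `x % N` gives the representative in {0, 1, ..., N − 1} directly, also for negative x.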

Example 1.2.10 Let ρ be the binary relation on N^+ × N^+ defined by (a, b) ρ (c, d) ⇔
ad = bc. Then ρ is an equivalence relation. We assume commutative, associative,
and cancellative laws for multiplication on N^+. The relation ρ is clearly reflexive
and symmetric. Next suppose (a, b) ρ (c, d) and (c, d) ρ (e, f). Then ad = bc and
cf = de ⇒ (ad)(cf) = (bc)(de) ⇒ af = be ⇒ (a, b) ρ (e, f) ⇒ ρ is transitive.
Consequently, ρ is an equivalence relation on N^+ × N^+.

Remark If the ordered pair (a, b) ∈ N^+ × N^+ is written as a fraction a/b, then the
above relation ρ is the usual definition of equality of two fractions: a/b = c/d ⇔
ad = bc.
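
The agreement between ρ and equality of fractions can be checked on small positive integers. The sketch below compares the cross-multiplication test with fractions.Fraction from the Python standard library; the helper name related and the bound 6 are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def related(p, q):
    """(a, b) ρ (c, d) iff ad = bc, as in Example 1.2.10."""
    (a, b), (c, d) = p, q
    return a * d == b * c


pairs = list(product(range(1, 7), repeat=2))  # all (a, b) with 1 <= a, b <= 6

agrees_with_fractions = all(
    related(p, q) == (Fraction(*p) == Fraction(*q)) for p in pairs for q in pairs
)
print(related((1, 2), (2, 4)), related((1, 2), (2, 3)))
print(agrees_with_fractions)
```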


1.2.2 Partial Order Relations 

We have so far made little use of the reflexive, antisymmetric, and transitive laws. We
are familiar with the natural ordering ≤ between two positive integers. This example
suggests the abstract concept of a partial order relation, which is a reflexive,
antisymmetric, and transitive relation. Partial order relations and their special types play
an important role in mathematics. For example, partial order relations are essential
in Zorn’s Lemma, which provides a very powerful tool in mathematics, and in lattice
theory, whose applications are enormous in different sciences.

Definition 1.2.6 A reflexive, antisymmetric, and transitive relation R on a non- 
empty set P is called a partial order relation. Then the pair (P, R) is called a par- 
tially ordered set or a poset. 

We adopt the symbol ‘≤’ to represent a partial order relation. So writing a ≤ b
in place of aRb, from Definition 1.2.6 it follows that

(i) a ≤ a for all a ∈ P;

(ii) a ≤ b and b ≤ a in P ⇒ a = b for a, b ∈ P and

(iii) a ≤ b and b ≤ c in P ⇒ a ≤ c for a, b, c ∈ P.

Fig. 1.2 Hasse diagram representing a < b

The following three examples are quite different in nature but possess identical
important properties.

Example 1.2.11 (i) (R, ≤) is a poset, where ‘≤’ denotes the natural ordering in R
(i.e., a ≤ b ⇔ b − a ≥ 0). Similarly, (R, ≥) is a poset under natural ordering ≥ in R.

(ii) (𝒫(A), ≤) is a poset under the inclusion relation ⊆ between subsets of A.

(iii) (N^+, ≤) is a poset under divisibility relation (i.e., a ≤ b ⇔ a divides b in N^+,
denoted by a | b).

A poset with a finite number of elements can be conveniently represented by 
means of a diagram, called the Hasse diagram , where a < b is represented by means 
of a diagram of the form as shown in Fig. 1.2, where a , b are represented by small 
circles, a being written below b and the two circles being joined by a straight line. 
Thus we have the following examples of posets as described in Fig. 1.3 of Exam- 
ple 1.2.12. 

Example 1.2.12 In Fig. 1.3, we have shown six different partial order relations 
((i)-(vi)). 

In a poset (P, ≤), we say a < b iff a ≤ b and a ≠ b. If a < b, we say that a
precedes b (or a is a predecessor of b or b succeeds a).

Definition 1.2.7 If a poset (P, ≤) is such that, for every pair of elements a, b
of P, exactly one of a < b, a = b, b < a holds [i.e., if any two elements of P are
comparable], then (P, ≤) is called a totally (fully or linearly) ordered set or simply
an ordered set or a chain.

Remark The partial order relation in a chain has a fourth property: ‘any two
elements are comparable’, in addition to the properties required in Definition 1.2.6.



Fig. 1.3 Different partial order relations 
Example 1.2.13 (i) The poset (R, ≤) (in Example 1.2.11(i)) is totally ordered.

(ii) The poset (N^+, ≤) (in Example 1.2.11(iii)) is not totally ordered, since the
integers 5 and 8 are not comparable, as neither divides the other.

(iii) The poset (𝒫(X), ≤) (in Example 1.2.11(ii)) is not totally ordered, provided
that X has at least two elements. For example, if X = {1, 2, 3}, A = {1, 2}, B =
{2, 3}, then neither A ⊆ B nor B ⊆ A holds.

Definition 1.2.8 Let (P, ≤) be a poset and a, b ∈ P. An element x ∈ P, where
x ≤ a and x ≤ b, is called a lower bound of a and b. If x is a lower bound of a, b in
P and y ≤ x holds for any lower bound y of a and b, then x (uniquely determined)
is called the greatest lower bound (glb or infimum or meet) of a and b and is denoted
by a ∧ b (or ab).

Similarly, an element x ∈ P, where a ≤ x and b ≤ x, is called an upper bound of
a and b. If x is an upper bound of a, b in P and x ≤ y holds for any upper bound
y of a and b, then x (uniquely determined) is called the least upper bound (lub or
supremum or join) of a and b and is denoted by a ∨ b (or a + b).

Definition 1.2.9 A poset in which each pair of elements

(i) has the lub is called an upper semilattice;

(ii) has the glb is called a lower semilattice; and

(iii) has both the lub and the glb is called a lattice.

Example 1.2.12(ii)-(iv), as shown in Fig. 1.3, are both upper and lower
semilattices and hence are lattices. But Example 1.2.12(i), (v) and (vi) are not so.
Indeed, Example 1.2.12(i) is neither an upper semilattice nor a lower semilattice;
Example 1.2.12(vi) is an upper semilattice but not a lower semilattice; and
Example 1.2.12(v) is a lower semilattice but not an upper semilattice.

The definition of a lower bound and upper bound, glb and lub, can be generalized
for any given collection of two or more elements of a poset.

Definition 1.2.10 A lattice (P, ≤) is said to be complete iff every non-empty
collection of elements of P has both a glb and a lub.

Example 1.2.14 (i) (𝒫(A), ≤) is a poset (cf. Example 1.2.11(ii)). Let P, Q ∈ 𝒫(A).
Then P ∩ Q ⊆ P and P ∩ Q ⊆ Q, so P ∩ Q is a lower bound of the pair P, Q. Also,
if R is any lower bound of the pair P, Q, i.e., R ⊆ P and R ⊆ Q, then R ⊆ P ∩ Q.
This shows that P ∩ Q is the glb of the pair P, Q ∈ 𝒫(A) i.e., P ∧ Q = P ∩ Q.
Similarly, P ∨ Q = P ∪ Q. Thus every pair of elements of 𝒫(A) has both the glb and
the lub. Hence (𝒫(A), ≤) is a lattice. Again for any non-empty collection of subsets
of A, their intersection and union are the glb and lub of the given collection, respectively.
Hence (𝒫(A), ≤) is a complete lattice.

(ii) The poset (N^+, ≤) in Example 1.2.11(iii) is a lattice, called the divisibility
lattice, where x ∧ y = gcd(x, y) and x ∨ y = lcm(x, y), where gcd(x, y) and
lcm(x, y) denote, respectively, the greatest common divisor and least common
multiple of x, y ∈ N^+. But this lattice is not complete. For, the subset X = {1, 2, 2^2, ...}
has no lub.
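
That gcd and lcm behave as glb and lub under divisibility can be tested on small positive integers. The sketch below uses math.gcd from the Python standard library; the helper names divides, is_glb, is_lub and the bound 12 are assumptions made for illustration, and common multiples are only examined up to a finite bound.

```python
from math import gcd


def lcm(x, y):
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)


def divides(a, b):
    """a ≤ b in the divisibility order, i.e. a | b."""
    return b % a == 0


def is_glb(g, x, y, bound):
    """g is a lower bound of x, y, and every lower bound up to `bound` precedes g."""
    if not (divides(g, x) and divides(g, y)):
        return False
    return all(divides(d, g) for d in range(1, bound + 1)
               if divides(d, x) and divides(d, y))


def is_lub(m, x, y, bound):
    """m is an upper bound of x, y, and m precedes every upper bound up to `bound`."""
    if not (divides(x, m) and divides(y, m)):
        return False
    return all(divides(m, k) for k in range(1, bound + 1)
               if divides(x, k) and divides(y, k))


N = 12
meets_and_joins_ok = all(
    is_glb(gcd(x, y), x, y, N) and is_lub(lcm(x, y), x, y, N * N)
    for x in range(1, N + 1)
    for y in range(1, N + 1)
)
print(meets_and_joins_ok)
print(divides(5, 8), divides(8, 5))  # 5 and 8 are not comparable
```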

Remark A totally ordered set is a lattice, but its converse is not true in general. From
Examples 1.2.14(ii) and 1.2.13(ii) it appears that (N^+, ≤) is a lattice, but it is not
totally ordered.

Definition 1.2.11 A lattice (L, ≤) is said to be distributive iff x ∨ (y ∧ z) = (x ∨
y) ∧ (x ∨ z) (or, equivalently, x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)) for all x, y, z ∈ L.

Definition 1.2.12 A lattice (L, ≤) is said to be modular (or Dedekind) iff for
x, y, z ∈ L, x ∨ y = z ∨ y, x ∧ y = z ∧ y and x ≤ z imply x = z (or, equivalently,
for x, y, z ∈ L, x ≤ z, the modular law (x ∨ y) ∧ z = x ∨ (y ∧ z) holds in L).

Example 1.2.15 (i) The divisibility lattice (N^+, ≤) is modular and distributive.

(ii) Every distributive lattice is modular but the converse is not true.

[Hint. (ii) Let (L, ≤) be a distributive lattice and x, y, z ∈ L be such that x ≤ z.
Then x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) = (x ∨ y) ∧ z (as L is distributive and
x ∨ z = z). This shows by Definition 1.2.12 that L is modular.]

Note that its converse is not true. (See Ex. 10 of Exercises-I and Ex. 20 of SE-I.)

Theorem 1.2.3 If in a poset (P, ≤), every subset (including ∅) has a glb in P, then
P is a complete lattice.

Proof Left as an exercise. □ 

Remark Any non-void complete lattice contains a least element 0 and a greatest
element 1.

Corollary If a poset P has 1, and every non-empty subset X of P has a glb, then
P is a complete lattice.

Example 1.2.16 The pentagonal lattice (L, ≤) is not modular, since v(u + b) = v
and u + vb = u. As u < v and u + vb < v(u + b), the modular law does not hold in
L and L is therefore non-modular as shown in Fig. 1.4.
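
The failure of the modular law in the pentagonal lattice can be reproduced by computing meets and joins from the order relation. The sketch below assumes the labelling 0 < u < v < 1 and 0 < b < 1, with b comparable to neither u nor v; the element names as strings and the helpers leq_relation, meet and join are assumptions made for this illustration.

```python
ELEMENTS = ["0", "u", "v", "b", "1"]
# Covering pairs (lower, upper) of the pentagonal lattice.
COVERS = {("0", "u"), ("u", "v"), ("v", "1"), ("0", "b"), ("b", "1")}


def leq_relation(elements, covers):
    """Reflexive and transitive closure of the covering pairs."""
    leq = {(x, x) for x in elements} | set(covers)
    changed = True
    while changed:
        changed = False
        for (p, q) in list(leq):
            for (r, s) in list(leq):
                if q == r and (p, s) not in leq:
                    leq.add((p, s))
                    changed = True
    return leq


LEQ = leq_relation(ELEMENTS, COVERS)


def meet(x, y):
    """Greatest lower bound of x and y."""
    lower = [z for z in ELEMENTS if (z, x) in LEQ and (z, y) in LEQ]
    return next(z for z in lower if all((w, z) in LEQ for w in lower))


def join(x, y):
    """Least upper bound of x and y."""
    upper = [z for z in ELEMENTS if (x, z) in LEQ and (y, z) in LEQ]
    return next(z for z in upper if all((z, w) in LEQ for w in upper))


left = meet(join("u", "b"), "v")    # (u ∨ b) ∧ v, written v(u + b) in the text
right = join("u", meet("b", "v"))   # u ∨ (b ∧ v), written u + vb in the text
print(left, right)
```

Although u ≤ v, the two sides of the modular law differ, which is the content of Example 1.2.16.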

Theorem 1.2.4 A lattice is non-modular iff it contains the pentagonal lattice as a
sublattice.

Proof Let (L, ≤) be a non-modular lattice. Then in L there exist five elements u, v,
b, u + b = v + b, ub = vb, which form the pentagonal lattice as shown in Fig. 1.5.
For the validity of the converse, see Example 1.2.16. □

Fig. 1.4 Pentagonal lattice

Fig. 1.5 Pentagonal lattice with u + b = v + b = 1 and ub = vb = 0

Definition 1.2.13 Let (P, ≤) be a poset. An element x ∈ P is said to be a minimal
element of P iff a ∈ P and a ≤ x imply a = x. Similarly, an element x ∈ P is said
to be a maximal element of P iff a ∈ P and x ≤ a imply a = x.

Thus an element x ∈ P is minimal (or maximal) iff no other element of P
precedes (or exceeds) x.

Example 1.2.17 (i) The poset (R, ≤) (cf. Example 1.2.11(i)) has neither minimal
nor maximal elements.

(ii) The minimal elements of the poset (𝒫(X) − {∅}, ≤) are the singleton subsets
of X.

(iii) In the divisibility lattice with the least element 1 removed, i.e., in
(N^+ − {1}, ≤), the prime numbers are the minimal elements.

Note Lattices are used in different areas, such as theoretical computer science, 
quantum mechanics, social science, biosystems, music etc. For applications of lat- 
tices to quantum mechanics, the books (Varadarajan 1968) and (Von Neumann 
1955) are referred to. 

Our main aim is now to state ‘Zorn’s Lemma’. A non-constructive criterion for
the existence of maximal elements is given by the so called maximality principle,
which is called Zorn’s Lemma (the principle goes back to Hausdorff and
Kuratowski, but Zorn gave a formulation of it which is particularly suitable to algebra,
analysis, and topology).

Lemma 1.2.1 (Zorn’s Lemma) Let (S, ≤) be a non-empty partially ordered set.
Suppose every subset A ⊆ S which is totally ordered by ≤ has an upper bound
(in S). Then S possesses at least one maximal element.

Remark Zorn’s Lemma is indispensable for proving many interesting results of
mathematics; one is given in this section and some are given in subsequent chapters.

An abstract Boolean algebra has a close connection with lattices. The following
is necessary to understand the concept of Boolean algebra.

By a finite collection of sets we always mean one which is empty or consists of
n sets for some positive integer n and by finite unions and finite intersections we
mean unions and intersections of finite collections of sets. If we say that a collection
ℬ of sets is closed under the formation of finite unions, we mean that ℬ contains
the union of each of its finite sub-collections; and since the empty sub-collection
qualifies as a finite sub-collection of ℬ, we see that its union, the empty set,
is necessarily an element of ℬ. In the same way, a collection of sets which is closed
under the formation of finite intersections necessarily contains the universal set.

We assume that the universal set U ≠ ∅.

Definition 1.2.14 A Boolean algebra of sets is a non-empty collection ℬ of subsets
of U which satisfies the following conditions:

(i) A and B ∈ ℬ imply A ∪ B ∈ ℬ;

(ii) A and B ∈ ℬ imply A ∩ B ∈ ℬ;

(iii) A ∈ ℬ implies A' ∈ ℬ.

Since ℬ ≠ ∅, it contains at least one set A.

Condition (iii) shows that A' is in ℬ along with A and since A ∩ A' = ∅ and
A ∪ A' = U, (i) and (ii) show that ∅ ∈ ℬ and U ∈ ℬ. Clearly, {U, ∅} is a Boolean
algebra of sets and every Boolean algebra of sets contains it. The other extreme is
the collection 𝒫(U) of all subsets of U, which is also a Boolean algebra of sets.

Let ℬ be a Boolean algebra of sets. If {A_1, A_2, ..., A_n} is a non-empty finite
sub-collection of ℬ, then A_1 ∪ A_2 ∪ ··· ∪ A_n ∈ ℬ and A_1 ∩ A_2 ∩ ··· ∩ A_n ∈ ℬ.
Moreover ∅, U ∈ ℬ. Clearly, ℬ is closed under the formation of finite unions, finite
intersections and complements. Conversely, let ℬ be a collection of sets which is
closed under the formation of finite unions, finite intersections, and complements.
Then ∅ ∈ ℬ and U ∈ ℬ. Clearly, ℬ is a Boolean algebra of sets. Consequently, we
may characterize Boolean algebras of sets as collections of sets which are closed
under the formation of finite unions, finite intersections, and complements.
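
The three conditions of Definition 1.2.14 can be checked directly for a finite collection of subsets. The sketch below is an illustration; the function name is_boolean_algebra, the universe {1, 2, 3, 4} and the sample collections are assumptions introduced for this example.

```python
def is_boolean_algebra(collection, universe):
    """Check conditions (i)-(iii) of Definition 1.2.14 for a finite collection."""
    family = {frozenset(s) for s in collection}
    universe = frozenset(universe)
    if not family:
        return False  # a Boolean algebra of sets is non-empty by definition
    for a in family:
        if universe - a not in family:      # (iii) closed under complements
            return False
        for b in family:
            if a | b not in family:         # (i) closed under unions
                return False
            if a & b not in family:         # (ii) closed under intersections
                return False
    return True


U = {1, 2, 3, 4}
smallest = [set(), U]
middle = [set(), {1, 2}, {3, 4}, U]
not_closed = [set(), {1}, U]  # the complement {2, 3, 4} is missing

print(is_boolean_algebra(smallest, U))
print(is_boolean_algebra(middle, U))
print(is_boolean_algebra(not_closed, U))
```

Closure under pairwise unions and intersections extends to all non-empty finite sub-collections by induction, which is why checking pairs suffices here.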

Remark An abstract Boolean algebra may be defined by means of lattices (Ex. 7 of 
Exercises-I). 


1.2.3 Operations on Binary Relations 

Operations on pairs of binary relations arise on many occasions in the study of
modern algebra. Such an operation is defined now.

Let R : X → Y and S : Y → Z be two binary relations. Then the composite S ∘ R
of R and S is defined by

S ∘ R = {(x, z) ∈ X × Z : there exists y ∈ Y such that (x, y) ∈ R and (y, z) ∈
S} ⊆ X × Z.

So, S ∘ R is a binary relation from X to Z.

If R is a binary relation from X to Y, the inverse R^{-1} is defined by

R^{-1} = {(y, x) : (x, y) ∈ R} ⊆ Y × X.

So, R^{-1} is a binary relation from Y to X.

Proposition 1.2.1 Let R : X → Y, S : Y → Z and T : Z → W be binary
relations. Then

(i) (T ∘ S) ∘ R = T ∘ (S ∘ R) (associative property);

(ii) (S ∘ R)^{-1} = R^{-1} ∘ S^{-1}.

Proof (i) Clearly, both (T ∘ S) ∘ R and T ∘ (S ∘ R) are binary relations from X
to W. Now (x, w) ∈ (T ∘ S) ∘ R ⇔ there exists y ∈ Y such that (x, y) ∈ R and
(y, w) ∈ T ∘ S, where x ∈ X, w ∈ W ⇔ there exists y ∈ Y such that (x, y) ∈ R and
there exists z ∈ Z such that (y, z) ∈ S and (z, w) ∈ T ⇔ there exist y ∈ Y and z ∈ Z
such that (x, y) ∈ R, (y, z) ∈ S and (z, w) ∈ T. Thus (x, w) ∈ (T ∘ S) ∘ R ⇔ ∃ z ∈ Z
such that (x, z) ∈ S ∘ R and (z, w) ∈ T ⇔ (x, w) ∈ T ∘ (S ∘ R).

Consequently, (T ∘ S) ∘ R = T ∘ (S ∘ R).

(ii) Again S ∘ R : X → Z ⇒ (S ∘ R)^{-1} : Z → X.

Clearly, both (S ∘ R)^{-1} and R^{-1} ∘ S^{-1} are relations from Z to X.

Then for z ∈ Z, x ∈ X, (z, x) ∈ (S ∘ R)^{-1} ⇔ (x, z) ∈ S ∘ R ⇔ ∃ y ∈ Y such that
(x, y) ∈ R and (y, z) ∈ S ⇔ ∃ y ∈ Y such that (y, x) ∈ R^{-1} and (z, y) ∈ S^{-1} ⇔
∃ y ∈ Y such that (z, y) ∈ S^{-1} and (y, x) ∈ R^{-1} ⇔ (z, x) ∈ R^{-1} ∘ S^{-1}.

Consequently, (S ∘ R)^{-1} = R^{-1} ∘ S^{-1}. □

Example 1.2.18 Let R be a relation on a set S. Then R is an equivalence relation
on S iff

(i) Δ ⊆ R, where Δ = {(x, x) : x ∈ S};

(ii) R = R^{-1} and

(iii) R ∘ R ⊆ R

hold. 
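
Composition and inversion of relations translate directly into operations on sets of ordered pairs. The sketch below checks both parts of Proposition 1.2.1 and the criterion of Example 1.2.18 on small relations; the function names compose, inverse and is_equivalence, as well as the sample relations R, S, T, are assumptions made for illustration.

```python
def compose(s, r):
    """S ∘ R: all (x, z) with (x, y) in R and (y, z) in S for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def inverse(r):
    """R^{-1}: all (y, x) with (x, y) in R."""
    return {(y, x) for (x, y) in r}


def is_equivalence(rel, domain):
    """Criterion of Example 1.2.18: Δ ⊆ R, R = R^{-1} and R ∘ R ⊆ R."""
    diagonal = {(x, x) for x in domain}
    return diagonal <= rel and rel == inverse(rel) and compose(rel, rel) <= rel


# Hypothetical relations R : X → Y, S : Y → Z, T : Z → W.
R = {(1, "a"), (1, "b"), (2, "b")}
S = {("a", "p"), ("b", "q")}
T = {("p", True), ("q", True), ("q", False)}

print(compose(compose(T, S), R) == compose(T, compose(S, R)))
print(inverse(compose(S, R)) == compose(inverse(R), inverse(S)))

DOMAIN = {0, 1, 2, 3}
same_parity = {(x, y) for x in DOMAIN for y in DOMAIN if (x - y) % 2 == 0}
print(is_equivalence(same_parity, DOMAIN))
```

Note the order of the arguments: `compose(s, r)` applies r first and s second, matching the notation S ∘ R.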


1.2.4 Functions or Mappings

The concept of functions (mappings) is perhaps the single most important and uni- 
versal notion used in all branches of mathematics. Sets and functions are closely 
related. They have the capacity for vast and intricate development. We are now in a 
position to define a function in terms of a binary relation. 
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Definition 1.2.15 Let X and Y be two non-empty sets. A function f from X to Y 
is defined to be a binary relation / such that 

(x, y) € f and (x, z) € f imply y = z 

(i.e., / is single valued). 

The domain of f denoted by dom /, range of f denoted by range / are defined 
by 

dom / = {x e A : (x, y) e / for some y e 7} c X; 
range / = {y e 7 : (x, y) e f for some iGlJcf 

Remark 1 Definition 1.2.15 means that for each $x \in$ dom $f$, there exists a unique
$y \in$ range $f \subseteq Y$ such that $(x, y) \in f$. Thus a function $f$ from a set $X$ to a
set $Y$ is a correspondence which assigns to each $x \in$ dom $f$ exactly one element
$y \in$ range $f \subseteq Y$. If $(x, y) \in f$, we write $y = f(x)$; $y$ is called the image of $x$ under
$f$ and $x$ is called a preimage of $y$ under $f$.

Remark 2 In elementary calculus, all the functions have the same range, namely,
the real numbers (depicted geometrically as the y-axis). In algebra there are many
different ranges, so when we introduce a function it is important to specify both
the domain and the range of the function as part of its definition.

The notation $f : X \to Y$ (or $X \xrightarrow{f} Y$) is used to mean that dom $f = X$.
Sometimes it is convenient to denote the effect of the function $f$ on an element $x$ of
$X$ by $x \mapsto f(x)$.

A function $f : X \to Y$ is sometimes called a map or mapping.

Remark This definition identifies a function with its graph in the usual
definition (by means of correspondence).

Definition 1.2.16 Two functions $f, g : X \to Y$ are said to be equal, denoted by
$f = g$, iff $f(x) = g(x)$ $\forall x \in X$.

Definition 1.2.17 Let $f : X \to Y$ be a binary relation and let $A \subseteq X$ and
$B \subseteq Y$. Then the image of $A$ under $f$, denoted by $f(A)$, is defined by

$f(A) = \{y \in Y : (x, y) \in f \text{ for some } x \in A\} \subseteq Y$;

the inverse image of $B$ under $f$, denoted by $f^{-1}(B)$, is defined by

$f^{-1}(B) = \{x \in X : (x, y) \in f \text{ for some } y \in B\} = \{x \in X : f(x) \in B\} \subseteq X$.

If $B \cap$ range $f = \emptyset$, then $f^{-1}(B) = \emptyset$.

Let $f : X \to Y$ and $A \subseteq X$. Thus $f(A)$ is given by

$f(A) = \{y \in Y : y = f(x) \text{ for some } x \in A \subseteq X\}$

= set of all images of points in $A$ under $f$.

Similarly, $f^{-1}(B) = \bigcup_{y \in B} f^{-1}(y)$.

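Theorem 1.2.5 Let $f : X \to Y$ be a binary relation and $A, B \subseteq X$ and $C, D \subseteq Y$.
Then

(1) $f(A \cup B) = f(A) \cup f(B)$;

(2) $f(A \cap B) \subseteq f(A) \cap f(B)$;

(3) $f^{-1}(C \cup D) = f^{-1}(C) \cup f^{-1}(D)$;

(4) $f^{-1}(C \cap D) = f^{-1}(C) \cap f^{-1}(D)$.

More generally, if $\{A_i : i \in I\}$ and $\{B_j : j \in J\}$ are two families of subsets of $X$ and $Y$,
respectively, then

(1') $f(\bigcup\{A_i : i \in I\}) = \bigcup_{i \in I} f(A_i)$;

(2') $f(\bigcap\{A_i : i \in I\}) \subseteq \bigcap\{f(A_i) : i \in I\}$;

(3') $f^{-1}(\bigcup\{B_j : j \in J\}) = \bigcup\{f^{-1}(B_j) : j \in J\}$;

(4') $f^{-1}(\bigcap\{B_j : j \in J\}) = \bigcap\{f^{-1}(B_j) : j \in J\}$.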

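Proof (1') $y \in f(\bigcup\{A_i : i \in I\}) \Leftrightarrow \exists x \in \bigcup A_i$ such that $y = f(x)$ $\Leftrightarrow \exists x \in A_i$ for
some $i \in I$ and $y = f(x)$ $\Leftrightarrow$ for some $i \in I$, $y \in f(A_i)$ $\Leftrightarrow y \in \bigcup\{f(A_i) : i \in I\}$.
This proves (1').

(2') Clearly, $M = \bigcap\{A_i : i \in I\} \subseteq A_i$ for each $i \in I$ $\Rightarrow f(M) \subseteq f(A_i)$ for each
$i \in I$ $\Rightarrow f(M) \subseteq \bigcap\{f(A_i) : i \in I\}$. This proves (2').

(3') $x \in f^{-1}(\bigcup\{B_j : j \in J\}) \Leftrightarrow \exists y \in \bigcup B_j$ such that $f(x) = y$ $\Leftrightarrow \exists y \in B_j$
for some $j \in J$ such that $f(x) = y$ $\Leftrightarrow x \in f^{-1}(B_j)$ for some $j \in J$ $\Leftrightarrow x \in
\bigcup\{f^{-1}(B_j) : j \in J\}$. This proves (3').

(4') $x \in f^{-1}(\bigcap\{B_j : j \in J\}) \Leftrightarrow f(x) \in \bigcap B_j \Leftrightarrow f(x) \in B_j$ for each
$j \in J$ $\Leftrightarrow x \in f^{-1}(B_j)$ for each $j \in J$ $\Leftrightarrow x \in \bigcap_{j \in J} f^{-1}(B_j)$. This proves (4'). □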

Remark Equality in (2), i.e., $f(A \cap B) = f(A) \cap f(B)$, does not occur in general;
it holds for all $A, B \subseteq X$ iff $x_1 \ne x_2$ in $X$ $\Rightarrow f(x_1) \ne f(x_2)$ for all pairs
$x_1, x_2 \in X$, i.e., iff $f$ is 1-1 (see Definition 1.2.18).

Example 1.2.19 Consider the map $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by $f(x, y) = (x, 0)$. Let $A =
\{(x, y) : x - y = 0\}$ and $B = \{(x, y) : x - y = 1\}$. Then $A \cap B = \emptyset$ and $f(A) \cap
f(B)$ = x-axis, because geometrically, $A$, $B$ represent parallel lines and both $f(A)$
and $f(B)$ represent the x-axis. Consequently, $f(A \cap B) = \emptyset \ne f(A) \cap f(B)$.
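A finite analogue of this example can be run directly. In the sketch below a function on a finite set is modelled as a Python dict, and the helper names `image` and `preimage` and the sample map are assumptions for illustration only.

```python
def image(f, A):
    """f(A) = {f(x) : x in A}, for a function given as a dict."""
    return {f[x] for x in A}


def preimage(f, B):
    """f^{-1}(B) = {x in dom f : f(x) in B}."""
    return {x for x in f if f[x] in B}


# A non-injective map: 1 and 2 share the image 'u'.
f = {1: 'u', 2: 'u', 3: 'v'}
A, B = {1}, {2}

lhs = image(f, A & B)            # f(A n B), empty because A and B are disjoint
rhs = image(f, A) & image(f, B)  # f(A) n f(B), contains 'u'
```

Here `lhs` is a proper subset of `rhs`, matching part (2) of Theorem 1.2.5, while inverse images commute with intersections as in part (4).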

Remark Let there exist a pair $x_1, x_2 \in X$ such that $x_1 \ne x_2$ but $f(x_1) = f(x_2) = y$
(say). Take $A = \{x_1\}$, $B = \{x_2\}$. Then $A \cap B = \emptyset$, $f(A \cap B) = \emptyset$, $f(A) = \{y\} =
f(B)$. So, $f(A) \cap f(B) = \{y\} \not\subseteq f(A \cap B)$. Equality holds iff $x_1 \ne x_2$ in $X$ $\Rightarrow$
$f(x_1) \ne f(x_2)$, i.e., iff $f$ is 1-1 (see below).

Definition 1.2.18 A map $f : X \to Y$ is said to be

(i) injective (or one-one i.e., 1-1) iff $x, x' \in X$, $x \ne x' \Rightarrow f(x) \ne f(x')$; i.e., iff
$\forall x, x' \in X$, $f(x) = f(x') \Rightarrow x = x'$;

(ii) surjective (or a surjection or onto) iff $f(X) = Y$; i.e., iff for each $y \in Y$, $\exists$ some
$x \in X$ such that $f(x) = y$;

(iii) bijective (or a bijection or a one-to-one correspondence) iff $f$ is both injective
and surjective.

Clearly, for any non-null set $X$, the identity map $I_X : X \to X$ defined by $I_X(x) =
x$ $\forall x \in X$ is bijective.

Remark Dirichlet observed that for two natural numbers $m$ and $n$ ($m > n$), there is
no injective map $f : \{1, 2, \ldots, m\} \to \{1, 2, \ldots, n\}$ and proved the following principle,
known as the 'Pigeonhole Principle'. This principle states that for two natural
numbers $m$ and $n$ ($m > n$), if $m$ objects are distributed over $n$ boxes, then some
box must contain more than one of the objects. In other words, if $n$ objects are
distributed over $n$ boxes in such a way that no box receives more than one object, then
each box receives exactly one object.

We now consider some other important concepts related to maps that are fre- 
quently used in different branches of mathematics. 

Definition 1.2.19 Given a function $f : X \to Y$ and $A \subseteq X$, the function from $A$
to $Y$ (i.e., the function considered only on $A$), given by $a \mapsto f(a)$, $\forall a \in A$, is called
the restriction of $f$ to $A$ and is denoted by $f|_A : A \to Y$.

In particular, $I_X|_A : A \to X$ is called the inclusion map of $A$ into $X$ and is
denoted by $i : A \hookrightarrow X$.

In the opposite direction, given $A \subseteq X$ and a function $f : A \to Y$, a function $F : X \to Y$
coinciding with $f$ on $A$ (i.e., satisfying $F|_A = f$) is called an extension of $f$ over $X$
relative to $Y$.

It is clear from the definitions that the restriction of a function is unique, but an
extension of a function need not be unique.

If $f : X \to Y$, $A \subseteq X$, and $g = f|_A : A \to Y$, then $f$ is an extension of $g$ over $X$.

Definition 1.2.20 Let $f : X \to Y$ and $g : Y \to Z$ be two functions. The composite
of $f$ and $g$ is the function $X \to Z$ given by $x \mapsto g(f(x))$, $x \in X$.

The composite function is denoted by $g \circ f$ or simply by $gf$. Clearly $g \circ f$ is
meaningful whenever range $f \subseteq$ dom $g$.

Proposition 1.2.2 Let $f : X \to Y$ and $g : Y \to Z$ be two maps. Then

(i) $f$ and $g$ are injective imply that $g \circ f$ is injective;

(ii) $f$ and $g$ are surjective imply that $g \circ f$ is surjective;

(iii) $g \circ f$ is injective implies that $f$ is injective;

(iv) $g \circ f$ is surjective implies that $g$ is surjective;

(v) $f$ and $g$ are bijective imply that $g \circ f$ is bijective;

(vi) $g \circ f$ is bijective implies that $f$ is injective and $g$ is surjective.

Proof (i) Let $f$ and $g$ be injective. Then for each $x, x' \in X$, $(g \circ f)(x) = (g \circ
f)(x') \Rightarrow g(f(x)) = g(f(x'))$ (by definition of $g \circ f$) $\Rightarrow f(x) = f(x')$ as $g$ is
injective $\Rightarrow x = x'$ as $f$ is also injective $\Rightarrow g \circ f : X \to Z$ is injective.

(ii) Suppose $f$, $g$ are surjective. Then $(g \circ f)(X) = g(f(X))$ (by definition of
$g \circ f$) $= g(Y)$ (as $f$ is surjective) $= Z$ as $g$ is surjective $\Rightarrow g \circ f$ is surjective.

(v) From (i) and (ii) it follows that if $f$ and $g$ are bijective, then $g \circ f$ is also
bijective.

(iii), (iv) and (vi) are left as exercises. □

Remark If $g \circ f$ is bijective, it does not follow that $f$ or $g$ is bijective.

Example 1.2.20 Let $X = \{1, 2, 3\}$, $Y = \{3, 4, 5, 6\}$. Define $f : X \to Y$ and $g :
Y \to X$ by

$f(1) = 3$, $f(2) = 4$, $f(3) = 5$;

$g(3) = 1$, $g(4) = 2$, $g(5) = 3$, $g(6) = 3$.

Then $g \circ f = I_X$. As $I_X$ is bijective, $g \circ f$ is bijective but neither $f$ nor $g$ is bijective.
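The claims of Example 1.2.20 can be verified mechanically. The following sketch encodes the two maps of the example as Python dicts; the helper names `is_injective` and `is_surjective` are hypothetical.

```python
def is_injective(f):
    """A finite map is injective iff no value is repeated."""
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    """A finite map is surjective iff its set of values is the whole codomain."""
    return set(f.values()) == set(codomain)


X = {1, 2, 3}
Y = {3, 4, 5, 6}
f = {1: 3, 2: 4, 3: 5}          # f : X -> Y
g = {3: 1, 4: 2, 5: 3, 6: 3}    # g : Y -> X

# Composite g o f : X -> X
gf = {x: g[f[x]] for x in X}
```

The composite `gf` is the identity on `X`, while `f` misses 6 and `g` sends both 5 and 6 to 3, in line with parts (iii) and (iv) of Proposition 1.2.2.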

Theorem 1.2.6 Let $f : X \to Y$, $g : Y \to Z$ and $h : Z \to W$ be maps. Then $(h \circ g) \circ
f = h \circ (g \circ f)$ (associative law).

Proof $((h \circ g) \circ f)(x) = (h \circ g)(f(x)) = h(g(f(x)))$ and $(h \circ (g \circ f))(x) = h((g \circ
f)(x)) = h(g(f(x)))$, for all $x \in X$. Consequently, $(h \circ g) \circ f = h \circ (g \circ f)$. □

Let $f : X \to Y$ be a binary relation. Then $f^{-1}$ is always defined as a binary
relation, but it need not be a function.

For example, if $f : \mathbb{R} \to \mathbb{R}^+$ is defined by $f(x) = x^2$, $\forall x \in \mathbb{R}$, then $f^{-1}(x) =
\{\pm x^{1/2}\}$ for $x > 0$ $\Rightarrow f^{-1}$ is not a function as $f^{-1}(x)$ consists of two elements.

In order that $f^{-1} : Y \to X$ is a function it is necessary that $(y, x_1), (y, x_2) \in
f^{-1} \Rightarrow x_1 = x_2$ i.e., $(x_1, y), (x_2, y) \in f \Rightarrow x_1 = x_2$ i.e., $f(x_1) = f(x_2) \Rightarrow x_1 = x_2$
i.e., $f$ is injective.

Theorem 1.2.7 Let $f : X \to Y$ be a map. Then

(i) $f$ is injective iff there is a map $g : Y \to X$ such that $g \circ f = I_X$.

(ii) $f$ is surjective iff there is a map $h : Y \to X$ such that $f \circ h = I_Y$.

Proof (i) Let $f$ be injective. Then for each $y \in f(X)$ there is a unique $x \in X$ such
that $f(x) = y$. Choose a fixed element $x_0 \in X$.

Define $g : Y \to X$ by

$g(y) = \begin{cases} x & \text{if } y \in f(X) \text{ and } f(x) = y \\ x_0 & \text{if } y \notin f(X) \end{cases}$

Then $g \circ f = I_X$.

Conversely, let there be a map $g : Y \to X$ such that $g \circ f = I_X$. Since $I_X$ is
bijective, it follows from Proposition 1.2.2(vi) that $f$ is injective.


(ii) Suppose $f$ is surjective. Then $f^{-1}(y) \subseteq X$ is a non-empty set for every
$y \in Y$. For each $y \in Y$, choose $x_y \in f^{-1}(y)$. Then the map $h : Y \to X$, $y \mapsto x_y$
is such that $f \circ h = I_Y$.

Conversely, let there be a map $h : Y \to X$ such that $f \circ h = I_Y$. Since $I_Y$ is
bijective, $f \circ h$ is bijective and hence by Proposition 1.2.2(vi) it follows that $f$ is
surjective. □

The maps g and h defined in Theorem 1.2.7 are, respectively, called a left inverse 
of f and a right inverse of f . 
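For finite maps the two constructions in the proof of Theorem 1.2.7 can be written out explicitly. This is a sketch under the assumption that maps are Python dicts; `left_inverse` and `right_inverse` are hypothetical names, and the "choice" of a preimage in the surjective case is simply the first one met.

```python
def left_inverse(f, Y, x0):
    """For an injective f : X -> Y, build g : Y -> X with g o f = I_X.

    Points of Y outside f(X) are sent to the fixed element x0, as in the proof.
    """
    back = {y: x for x, y in f.items()}   # well defined because f is injective
    return {y: back.get(y, x0) for y in Y}


def right_inverse(f, Y):
    """For a surjective f : X -> Y, build h : Y -> X with f o h = I_Y."""
    chosen = {}
    for x, y in f.items():
        chosen.setdefault(y, x)           # keep one preimage x_y for each y
    return {y: chosen[y] for y in Y}


# Injective but not surjective.
f1 = {1: 'a', 2: 'b'}
g = left_inverse(f1, {'a', 'b', 'c'}, x0=1)

# Surjective but not injective.
f2 = {1: 'a', 2: 'a', 3: 'b'}
h = right_inverse(f2, {'a', 'b'})
```

Different choices of `x0` or of the preimages give different one-sided inverses, which illustrates that left and right inverses are not unique in general.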

If a map $f : X \to Y$ has both a left inverse $g$ and a right inverse $h$, then they
are equal. This is so because $g = g \circ I_Y = g \circ (f \circ h) = (g \circ f) \circ h = I_X \circ h = h$.

The map $g = h$ is called an inverse (or two sided inverse) of $f$. Thus the inverse
of a map $f$ (if it exists) is unique.

Corollary A map $f : X \to Y$ is bijective $\Leftrightarrow$ $f$ has an inverse $\Leftrightarrow$ there exists $g :
Y \to X$ such that $g \circ f = I_X$ and $f \circ g = I_Y$.

Proof Left as an exercise. □ 

The unique inverse of a bijection $f$ is denoted by $f^{-1}$. Thus if $f$ is bijective,
then for each $y \in Y$, there exists a unique element $x \in X$ such that $f(x) = y$ and
hence $f^{-1}(y)$ consists of a single element of $X$. Therefore there exists a
correspondence that assigns to each $y \in Y$ a unique element $f^{-1}(y) \in X$. Accordingly,
$f^{-1} : Y \to X$ is a function.

A map satisfying any one condition of Corollary of Theorem 1.2.7 is said to be 
invertible. 

Theorem 1.2.8 Let $f : X \to Y$ and $g : Y \to Z$ have inverse functions $f^{-1} :
Y \to X$ and $g^{-1} : Z \to Y$. Then the composite function $g \circ f : X \to Z$ has an
inverse function which is $f^{-1} \circ g^{-1} : Z \to X$.

Proof As $f : X \to Y$ and $g : Y \to Z$ have inverse functions, $f^{-1} \circ f = I_X$,
$f \circ f^{-1} = I_Y = g^{-1} \circ g$, $g \circ g^{-1} = I_Z$.

Now

$(f^{-1} \circ g^{-1}) \circ (g \circ f) = f^{-1} \circ (g^{-1} \circ (g \circ f)) = f^{-1} \circ ((g^{-1} \circ g) \circ f)$

$= f^{-1} \circ (I_Y \circ f) = f^{-1} \circ f = I_X$.

Similarly, $(g \circ f) \circ (f^{-1} \circ g^{-1}) = I_Z$.

Hence the theorem follows by using the Corollary of Theorem 1.2.7. □

We now reach the climax of this sequence of theorems. 

Theorem 1.2.9 If $X$ is a non-null set, then the assignment $\rho \mapsto X/\rho$ defines a
bijection from the set $E(X)$ of all equivalence relations on $X$ onto the set $\mathcal{P}(X)$ of
all partitions of $X$.

Proof If $\rho$ is an equivalence relation on $X$, the set $X/\rho$ of equivalence classes is the
partition $\mathcal{P}_\rho$ of $X$ by Theorem 1.2.1 so that $\rho \mapsto X/\rho = \mathcal{P}_\rho$ defines a function

$f : E(X) \to \mathcal{P}(X)$.

Define a function $g : \mathcal{P}(X) \to E(X)$ as follows:

If $\mathcal{P} = \{X_i : i \in I\}$ is a partition of $X$, let $g(\mathcal{P})$ be the equivalence relation $\rho_{\mathcal{P}}$ on
$X$ given by:

$(a, b) \in \rho_{\mathcal{P}} \Leftrightarrow a \in X_i$ and $b \in X_i$ for some (unique) $i \in I$.

Then $\rho_{\mathcal{P}}$ is an equivalence relation on $X$ by Theorem 1.2.2. Hence $g$ is well
defined. It is clear that $g \circ f = I_{E(X)}$ and $f \circ g = I_{\mathcal{P}(X)}$.

Because $g(f(\rho)) = g(\mathcal{P}_\rho) = \rho_{\mathcal{P}_\rho} = \rho$ (by Theorem 1.2.2(b)), for all $\rho \in
E(X)$ $\Rightarrow g \circ f = I_{E(X)}$.

Again $(f \circ g)(\mathcal{P}) = f(\rho_{\mathcal{P}}) = \mathcal{P}_{\rho_{\mathcal{P}}} = \mathcal{P}$ (by Theorem 1.2.2(a)), for all $\mathcal{P} \in
\mathcal{P}(X)$ $\Rightarrow f \circ g = I_{\mathcal{P}(X)}$.

Then the theorem follows from Theorem 1.2.7. □
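The two mutually inverse assignments in this proof can be sketched for a finite set. Relations are modelled as sets of pairs and partitions as frozensets of frozensets; the names `quotient` and `relation_of` are assumptions for this illustration.

```python
def quotient(X, rho):
    """X/rho: the set of equivalence classes of rho, as a frozenset of frozensets."""
    return frozenset(frozenset(b for b in X if (a, b) in rho) for a in X)


def relation_of(partition):
    """The relation rho_P: (a, b) belongs to it iff a and b lie in the same block."""
    return {(a, b) for block in partition for a in block for b in block}


X = {0, 1, 2, 3}
# Congruence modulo 2 as an equivalence relation on X.
rho = {(a, b) for a in X for b in X if (a - b) % 2 == 0}

P = quotient(X, rho)   # the blocks {0, 2} and {1, 3}
```

Applying `relation_of` to `P` returns `rho`, and applying `quotient` to that relation returns `P` again, which is the content of $g \circ f = I_{E(X)}$ and $f \circ g = I_{\mathcal{P}(X)}$ on this example.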

The question naturally arises whether or not two given sets have the same number
of elements. For sets with a finite number of elements the answer is obtained by
counting the number of elements in each set. But a difficulty arises if the number
of elements of a set is not finite.

The following concept, introduced by German mathematician Georg Cantor 
(1845-1918), removed this difficulty and led to the theory of sets. 

Definition 1.2.21 A set $X$ is equivalent (or similar or equipotent) to a set $Y$, denoted
by $X \sim Y$, iff there is a bijective map $f : X \to Y$.

Theorem 1.2.10 The relation $\sim$ on the class of all sets is an equivalence relation.

Proof $X \sim X$ as $I_X : X \to X$ is a bijective map for every $X$. Again $X \sim Y \Rightarrow
Y \sim X$, since a bijective map $f : X \to Y$ has an inverse $f^{-1} : Y \to X$ which is also
bijective. Finally, $X \sim Y$ and $Y \sim Z \Rightarrow X \sim Z$, since if $f : X \to Y$ and $g : Y \to Z$
are bijective maps, then $g \circ f : X \to Z$ is also bijective. □

Example 1.2.21 (i) Consider the concentric circles $C_1 = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x^2 +
y^2 = a^2\}$ and $C_2 = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x^2 + y^2 = b^2\}$ with center $O = (0, 0)$, where
$0 < a < b$, and the function $f : C_2 \to C_1$, where $f(x)$ is the point of intersection of
$C_1$ and the line segment from the center $O$ to $x \in C_2$. Then $f$ is a bijection.

The bijection of sets yields equivalent sets. Here $C_1$ and $C_2$ are equivalent as sets
as shown in Example-1 of Fig. 1.6.

(ii) Let $\mathbb{C}_\infty = \mathbb{C} \cup \{\infty\}$ be the extended complex plane and $S^2 = \{(x, y, z) \in \mathbb{R}^3 :
x^2 + y^2 + z^2 = 1\}$ be the unit sphere in $\mathbb{R}^3$. There exists a bijection $f : S^2 \to \mathbb{C}_\infty$
[Conway 1973] called stereographic projection, as shown in Example-2 of Fig. 1.6,
and hence $S^2$ and $\mathbb{C}_\infty$ are equivalent sets. Thus $\mathbb{C}_\infty$ is represented as the sphere $S^2$
called the Riemann sphere.

Fig. 1.6 Example-1 represents the equivalent sets $C_1$ and $C_2$ and Example-2 represents
stereographic projection


Definition 1.2.22 A set $X$ is said to be finite iff either $X = \emptyset$ or $X \sim \mathbb{Z}_n$ for some
positive integer $n \ge 1$, i.e., iff $X$ and $\mathbb{Z}_n$ have the same number of elements.

Proposition 1.2.3 A finite set X cannot be equivalent to a proper subset of X. 

Proof It follows from Definition 1.2.22. □ 

Definition 1.2.23 A set which is not finite is said to be infinite. Z, N are evidently 
examples of infinite sets. 


1.3 Countability and Cardinality of Sets 

We are familiar with positive integers 1, 2, 3, ... , and in daily life we use them for 
counting. They are adequate for counting the elements of a finite set. Beyond the 
area of mathematics, all sets are generally finite sets. But in mathematics, we come 
across many infinite sets, such as set of all natural numbers, set of all integers, set 
of all rational numbers, set of all real numbers, set of all points in a square or in a 
plane etc. We now describe a method, essentially due to Cantor for counting such 
infinite sets by introducing the concepts of countability and cardinality. The concept 
of countability of a set is very important in mathematics, in particular, in algebra, 
analysis, topology etc. This concept has great aesthetic appeal and develops through 
natural stages into an excellent structure of thought. 

Definition 1.3.1 A set $X$ is said to be countable iff either $X$ is finite or there exists
a bijection $f : X \to \mathbb{N}^+$. (In the latter case $X$ is said to be infinitely countable.)

Proposition 1.3.1 Let $X$ be a countable set and $f : X \to Y$ be a bijection. Then $Y$
is also a countable set.

Proof Suppose $X$ is a non-empty finite set. Then there is a bijection $g : X \to \mathbb{Z}_n$ for
some integer $n \ge 1$. Hence $g \circ f^{-1} : Y \to \mathbb{Z}_n$ is also a bijection and so $Y$ is finite
and hence $Y$ is countable. If $g : X \to \mathbb{N}^+$ is a bijection, then $g \circ f^{-1} : Y \to \mathbb{N}^+$ is a
bijection, i.e., $X$ is infinitely countable iff $Y$ is infinitely countable. □

Lemma 1.3.1 Every subset of N + is countable. 

Proof Let $A$ be a subset of $\mathbb{N}^+$. If $A$ is finite, the proof is obvious. Next let $A$ be not
finite. Then by the well-ordering principle (see Sect. 1.4), $A$ has a least element $n_1$ (say).
If $\{n_1\} \ne A$, $A - \{n_1\}$ has a least element $n_2$ (say). If $A \ne \{n_1\} \cup \{n_2\} = \{n_1, n_2\}$,
the same process is continued to obtain

$A = \{n_1, n_2, \ldots\}$.

Clearly, there exists a bijection $f : A \to \mathbb{N}^+$ given by $f(n_r) = r$ for $r = 1, 2, 3, \ldots$.

Consequently, $A$ is countable. □

This result can be generalized as follows. 

Theorem 1.3.1 Every subset of a countable set is countable. 

Proof Let $X$ be countable and $Y \subseteq X$. If $Y = \emptyset$ or $Y$ is finite, the proof is obvious.
We now consider the other case. As $X$ is infinitely countable, there exists a
bijection $f : X \to \mathbb{N}^+$. We define a map $g : Y \to \mathbb{N}^+$ by taking $g = f|_Y$. Since $f$ is a
bijection and $Y \subseteq X$, $g$ is injective and hence $g : Y \to g(Y) \subseteq \mathbb{N}^+$ is a bijection. As
$g(Y)$ is countable by Lemma 1.3.1 and $g^{-1} : g(Y) \to Y$ is a bijection, $Y$
is countable by Proposition 1.3.1. □

If a set $X$ is infinitely countable, then there exists a bijection $f : X \to \mathbb{N}^+$. For the
particular $x$ with $f(x) = n$, the integer $n$ is called the suffix of the element $x$, which is
then denoted by $x_n$, for $n = 1, 2, 3, \ldots$. Every element of $X$ may thus be suffixed, and
the elements of $X$ can be arranged as $x_1, x_2, x_3, \ldots$ in order of the suffixes $1, 2, 3, \ldots$.
The set $X$, thus arranged, is called a sequence, and is usually denoted by $\{x_n\}$.

Proposition 1.3.2 Every infinite set contains a countable subset. 

Proof Let $X$ be an infinite set and $x_1 \in X$. Let $x_2 \in X - \{x_1\}$, $x_3 \in X - \{x_1, x_2\}$, $\ldots$,
$x_n \in X - \{x_1, x_2, \ldots, x_{n-1}\}$. Continuing this process, we obtain a sequence of
distinct elements $x_1, x_2, \ldots, x_n, \ldots$ of $X$. Then $X$ contains a countable subset $\{x_1,
x_2, \ldots, x_n, \ldots\}$. □

Theorem 1.3.2 The union of a countable aggregate of countable sets is countable.

Proof Let $X_1, X_2, \ldots$ be a countable aggregate of countable sets $X_i$. Let $X_i =
\{x_{i1}, x_{i2}, \ldots\}$, $i \in \mathbb{N}^+$. Then the elements of their union $X = X_1 \cup X_2 \cup X_3 \cup \cdots$
can be arranged as a double array:

$x_{11}, x_{12}, x_{13}, \ldots$
$x_{21}, x_{22}, x_{23}, \ldots$
$\vdots$
$x_{i1}, x_{i2}, x_{i3}, \ldots$
$\vdots$

where the elements of the $i$th set $X_i$ are arranged in the $i$th row. Rearranging the
elements we obtain a sequence:

$x_{11};\ x_{12}, x_{21};\ x_{13}, x_{22}, x_{31};\ \ldots$

which consists of 'blocks', separated by semicolons, of elements $x_{ij}$, where the $n$th
block consists of all $x_{ij}$ for which $i + j = n + 1$, with $i$ increasing. In particular,
an element $x_{ij}$ occupies the $i$th position in the $(i + j - 1)$th block; the position
being $[\{1 + 2 + \cdots + (i + j - 2)\} + i]$th from the beginning. Consequently, $X$ is
countable. □
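The position formula at the end of this proof can be checked by listing the first few blocks. The sketch below is only illustrative; the function names `position` and `enumerate_blocks` are assumptions, and index pairs $(i, j)$ stand for the elements $x_{ij}$.

```python
def position(i, j):
    """Position (counted from 1) of x_ij in the rearranged sequence.

    x_ij lies in block n = i + j - 1 at place i, and the earlier blocks
    contain 1 + 2 + ... + (n - 1) = (n - 1) n / 2 elements in total.
    """
    n = i + j - 1
    return (n - 1) * n // 2 + i


def enumerate_blocks(num_blocks):
    """List the index pairs (i, j) block by block, with i increasing inside a block."""
    return [(i, n + 1 - i) for n in range(1, num_blocks + 1) for i in range(1, n + 1)]


first_three_blocks = enumerate_blocks(3)
# The sequence x11; x12, x21; x13, x22, x31 corresponds to
# (1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1).
```

Since every pair $(i, j)$ receives exactly one position, the map `position` is an explicit bijection from $\mathbb{N}^+ \times \mathbb{N}^+$ onto $\mathbb{N}^+$. When the sets $X_i$ overlap, repeated elements would still have to be skipped.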

Theorem 1.3.3 The set Q of all rationals is countable. 

Proof The set $\mathbb{Q}^+$ of all positive rational numbers is countable by Theorem 1.3.2,
since $\mathbb{Q}^+$ is the union of the countable aggregate of countable sets $X_1, X_2, \ldots,
X_r, \ldots$, where $X_r$ is the countable set $\{n/r : n \in \mathbb{N}^+\}$ for $r = 1, 2, \ldots$. Similarly, the
set $\mathbb{Q}^-$ of all negative rational numbers is also countable. Consequently, the set
$\mathbb{Q} = \mathbb{Q}^+ \cup \{0\} \cup \mathbb{Q}^-$ is countable. □

We now utilize the concept of countability to define a cardinal number. Let $\mathcal{B}$
be a collection of sets. Then the relation $\sim$ (equivalent) is an equivalence relation
on $\mathcal{B}$ (by Theorem 1.2.10). Consequently $\sim$ partitions $\mathcal{B}$ into disjoint
equivalence classes; each class carries a special name, called a cardinal number. Every
set belongs to a cardinal number; this is generally expressed by saying that a set
has a cardinal number. Finite sets which are equivalent have the same number of
elements; hence the number of elements of a finite set may be taken as its cardinal
number. The cardinal numbers of the sets $\emptyset, \{1\}, \{1, 2\}, \{a, b, c\}, \ldots$ are
$0, 1, 2, 3, \ldots$ respectively.

Cantor discovered that the infinite set of all real numbers is not countable. We 
prove this by using the following theorem. 

Theorem 1.3.4 The set $X = (0, 1]$ of all positive real numbers $\le 1$ is not countable.

Proof Assume to the contrary, i.e., assume that $X$ is countable. Then $X = \{x_1, x_2,
\ldots, x_n, \ldots\}$, say. Now each element $x_n \in X$ can be expressed uniquely as a
non-terminating decimal expression: $x_n = 0.x_{n1}x_{n2}x_{n3}\ldots$, where $x_{ni} \in \{0, 1, 2, \ldots, 9\}$,
in a non-trivial manner (i.e., for no $m$ are all the digits $x_{ni}$ with $i > m$ equal to zero;
a terminating decimal expression is replaced by a non-terminating one, viz.,
$0.12 = 0.119999\ldots$). Then each element in $X$ can be written as follows:

$x_1 = 0.x_{11}x_{12}x_{13}\ldots$
$x_2 = 0.x_{21}x_{22}x_{23}\ldots$
$\vdots$
$x_n = 0.x_{n1}x_{n2}x_{n3}\ldots$
$\vdots$

A non-terminating decimal expression $y = 0.y_1y_2y_3\ldots y_n\ldots$, where $y_n$ is taken
to be a non-zero digit different from $x_{nn}$ for $n = 1, 2, 3, \ldots$, does not occur in this list,
since $y$ differs from every $x_n$ at the $n$th place after the decimal point. But $y$ determines a real
number in $X$. This contradicts the assumption that every real number $r \in X$ occurs in the sequence
$\{x_1, x_2, \ldots, x_n, \ldots\}$. Consequently, $X$ must be non-countable. □
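The diagonal construction can be imitated on a finite table of digits. The sketch below works with truncated expansions given as digit strings, which is an assumption made purely for illustration: a finite table proves nothing about countability, it only shows how $y$ is built. The name `diagonal_digits` and the particular rule for choosing each digit (1, or 2 when the diagonal digit is 1) are hypothetical choices.

```python
def diagonal_digits(rows):
    """Return the first len(rows) digits of y.

    rows[n] holds the leading digits of the (n+1)-th listed number after the
    decimal point. Each digit of y is non-zero and differs from the
    corresponding diagonal digit.
    """
    out = []
    for n, row in enumerate(rows):
        diagonal_digit = row[n]
        out.append('2' if diagonal_digit == '1' else '1')
    return ''.join(out)


# Hypothetical list of four truncated expansions 0.1415..., 0.7182..., etc.
table = ["1415", "7182", "4142", "3333"]
y = diagonal_digits(table)   # digits of y after the decimal point
```

Because every digit of `y` is non-zero, the resulting expansion is automatically non-terminating, which is why the proof insists on non-zero digits.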

Corollary The set R of all real numbers is non-countable. 

We extend the notion of counting by assigning to every set $X$ (finite or infinite)
an object $|X|$, called its cardinal number or cardinality, defined in such a way that
$|X| = |Y|$ if and only if $X$ is equipotent to $Y$, i.e., $X \sim Y$. This definition makes
sense by Theorem 1.2.10. Thus if $X$ is a finite set of $n$ elements then $|X| = n$. The
concept of countability accommodates more infinite sets for determination of their
cardinality; e.g., $|\mathbb{N}| = |\mathbb{Q}|$. The cardinal number $d$ or $c$ of an infinite set $X$ asserts
that the set is countable or not countable, respectively.

The cardinal number of an infinite set is called a transfinite cardinal number. In
particular, the cardinal number of $\mathbb{N}^+$, and hence also of every infinite countable set,
is denoted by the symbol $d$ (or $\aleph_0$, i.e., aleph with suffix 0). Let the cardinal number
of a set $X$ be denoted by $|X|$. Then $|\mathbb{N}^+| = d$. As $\mathbb{R}$ is non-countable, $|\mathbb{R}| \ne d$ and
$|\mathbb{R}|$ is denoted by $c$ (or, under the continuum hypothesis of Sect. 1.3.1, by $\aleph_1$, i.e.,
aleph with suffix 1); it is called the power (or potency) of the continuum. In practice,
most infinite sets that one meets have cardinal number $d$ or $c$.

We now extend the concept of natural ordering of positive integers to the set of 
cardinal numbers. 

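Let $A$ and $B$ be two sets such that $|A| = \alpha$ and $|B| = \beta$. Then we say that $\alpha \le \beta$
(equivalently $\beta \ge \alpha$) iff the set $A$ is equipotent to a subset of $B$. Also, we say $\alpha < \beta$
(or $\beta > \alpha$) iff $\alpha \le \beta$ and $\alpha \ne \beta$ hold.

Proposition 1.3.3 $d$ is the smallest transfinite cardinal number.

Proof By Proposition 1.3.2 every infinite set contains an infinitely countable subset, so
$d \le \alpha$ for every transfinite cardinal number $\alpha$; in particular $d \le c$. But $d \ne c$ by the
Corollary of Theorem 1.3.4. □

The following theorems are used to show that the above relation '$\le$' is
antisymmetric.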

Theorem 1.3.5 (Schroeder-Bernstein Equivalence Principle) Let $X$, $X_1$ and $X_2$ be
three sets such that $X \supseteq X_1 \supseteq X_2$ and $X \sim X_2$. Then $X \sim X_1$.

Proof Since $X \sim X_2$, there exists a bijection $f : X \to X_2$. Furthermore, since $X \supseteq
X_1$, $f|_{X_1}$ is injective and hence $X_1 \sim f(X_1) = X_3$ (say) $\subseteq X_2$. Recursively, let
$f(X_r) = X_{r+2}$, for $r = 1, 2, 3, \ldots$. Consequently, we have

$X \supseteq X_1 \supseteq X_2 \supseteq X_3 \supseteq \cdots \supseteq X_r \supseteq X_{r+1} \supseteq X_{r+2} \supseteq \cdots$,

where $X \sim X_2 \sim X_4 \sim X_6 \sim \cdots$ and $X_1 \sim X_3 \sim X_5 \sim X_7 \sim \cdots$ and hence
$X - X_1 \sim X_2 - X_3 \sim X_4 - X_5 \sim X_6 - X_7 \sim \cdots$ holds under $f$. Thus

$(X - X_1) \cup (X_2 - X_3) \cup (X_4 - X_5) \cup (X_6 - X_7) \cup \cdots \sim (X_2 - X_3) \cup (X_4 - X_5) \cup (X_6 - X_7) \cup \cdots$

holds under $f$.

Let

$Y = X \cap X_1 \cap X_2 \cap X_3 \cap \cdots$.

Then

$X = Y \cup (X - X_1) \cup (X_1 - X_2) \cup (X_2 - X_3) \cup (X_3 - X_4) \cup \cdots$,

and

$X_1 = Y \cup (X_1 - X_2) \cup (X_2 - X_3) \cup (X_3 - X_4) \cup (X_4 - X_5) \cup \cdots$.

Define the map $g : X \to X_1$ as follows:

$g(x) = \begin{cases} f(x), & \text{if } x \in (X - X_1) \cup (X_2 - X_3) \cup (X_4 - X_5) \cup \cdots, \\ x, & \text{if } x \in (X_1 - X_2) \cup (X_3 - X_4) \cup (X_5 - X_6) \cup \cdots \text{ or } x \in Y. \end{cases}$

Then $g$ is a bijection. Consequently, $X \sim X_1$. □
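The map $g$ of this proof can be made concrete in a toy case. The sketch below assumes $X = \{0, 1, 2, \ldots\}$, $X_1 = \{1, 2, 3, \ldots\}$, $X_2 = \{2, 3, 4, \ldots\}$ and the bijection $f(n) = n + 2$ from $X$ onto $X_2$; these choices are illustrative and not taken from the text. With them $X_r = \{n : n \ge r\}$, each layer $X_r - X_{r+1}$ is the single point $\{r\}$ (writing $X_0 = X$), and $Y = \emptyset$.

```python
def f(x):
    """Assumed bijection from X = {0, 1, 2, ...} onto X_2 = {2, 3, 4, ...}."""
    return x + 2


def layer(x):
    """Index r with x in X_r - X_{r+1}, where X_0 = X and X_r = {n : n >= r}."""
    return x   # in this toy chain the layer X_r - X_{r+1} is exactly {r}


def g(x):
    """Map of the proof: apply f on the layers X - X_1, X_2 - X_3, ..., fix the rest."""
    return f(x) if layer(x) % 2 == 0 else x


images = [g(x) for x in range(10)]   # 2, 1, 4, 3, 6, 5, 8, 7, 10, 9
```

Even numbers are shifted by 2 and odd numbers are left alone, so $g$ maps $X$ bijectively onto $X_1 = \{1, 2, 3, \ldots\}$.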

Theorem 1.3.6 (Schroeder-Bernstein Theorem) If for given two sets $X$ and $Y$,
$X \sim Y_1 \subseteq Y$ and $Y \sim X_1 \subseteq X$, then $X \sim Y$.

Proof Let $f : X \to Y_1$ be a bijection and $f(X_1) = Y_2$. Then $X_1 \sim Y_2$ and $Y_1 \supseteq Y_2$.
Now $Y \supseteq Y_1 \supseteq Y_2$. Moreover, $Y \sim X_1$ (by hypothesis) and $X_1 \sim Y_2 \Rightarrow Y \sim Y_2$.
Then $Y \sim Y_1$ by Theorem 1.3.5. Hence $X \sim Y_1$ and $Y \sim Y_1 \Rightarrow X \sim Y$. □

Corollary If $\alpha$ and $\beta$ are two cardinal numbers such that $\alpha \le \beta$ and $\beta \le \alpha$, then
$\alpha = \beta$.

Applications of Zorn’s Lemma and the Schroeder-Bernstein Theorem 

A(i) Let $A$ be a partially ordered set. Then $\exists$ a totally ordered subset of $A$ which is
not a proper subset of any other totally ordered subset of $A$.

Proof Let $\mathcal{B}$ be the class of all totally ordered subsets of $A$. Order $\mathcal{B}$ partially by set
inclusion. We prove by Zorn's Lemma (Lemma 1.2.1) that $\mathcal{B}$ has a maximal element.
Let $\mathcal{A} = \{B_i : i \in I\}$ be a totally ordered subclass of $\mathcal{B}$ and $X = \bigcup\{B_i : i \in I\}$. Then
$B_i \subseteq A$ $\forall B_i \in \mathcal{A}$ $\Rightarrow X \subseteq A$. We claim that $X$ is totally ordered. Let $x, y \in X$. Then
$\exists B_j, B_k \in \mathcal{A}$ such that $x \in B_j$, $y \in B_k$ for some $j, k \in I$. As $\mathcal{A}$ is totally ordered
by set inclusion, one of them is a subset of the other. Without loss of generality, let
$B_j \subseteq B_k$. Then $x, y \in B_k \in \mathcal{A}$, a totally ordered subset of $A$. Hence either $x \le y$
or $y \le x$. Then $X$ is a totally ordered subset of $A$ and hence $X \in \mathcal{B}$. As $B_i \subseteq X$
$\forall B_i \in \mathcal{A}$, $X$ is an upper bound of $\mathcal{A}$. Thus every totally ordered subset of $\mathcal{B}$ has
an upper bound in $\mathcal{B}$. Hence by Zorn's Lemma $\mathcal{B}$ has a maximal element. In other
words, $\exists$ a totally ordered subset of $A$ which is not a proper subset of any other
totally ordered subset of $A$. □

A(ii) Let $\{A_i\}_{i \in I}$ be any infinite family of countable sets. If $A = \bigcup_{i \in I} A_i$, then card
$A \le$ card $I$, i.e., $|A| \le |I|$.

Proof If $I$ is countably infinite, then $A$, being the countable union of countable sets,
is also countable by Theorem 1.3.2. If $I$ is uncountable, then $A$ can be represented
by Zorn's Lemma as the union of a disjoint family of countably infinite subsets.
Hence card $A \le$ card $I$. □

B Let $I = [0, 1]$ and $I^n = \{(x_1, x_2, \ldots, x_n) : x_i \in I, \text{ for } i = 1, \ldots, n\}$ be the
$n$-dimensional unit cube. Then $I \sim I^n$.

Proof Define $T = \{(x, 0, 0, \ldots, 0) \in I^n\}$. Then $T \subseteq I^n$. Define a bijection $f : I \to
T$ by $f(x) = (x, 0, 0, \ldots, 0)$.

Hence $I \sim T \subseteq I^n$.

Again for $(x_1, x_2, \ldots, x_n) \in I^n$, $x_i$ can be expressed uniquely as a non-terminating
decimal expression:

$x_i = 0.x_{i1}x_{i2}\ldots$, for $i = 1, 2, \ldots, n$.

Now define an injective map $g : I^n \to I$ by

$g(0.x_{11}x_{12}\ldots,\ 0.x_{21}x_{22}\ldots,\ \ldots,\ 0.x_{n1}x_{n2}\ldots) = 0.x_{11}x_{21}\ldots x_{n1}x_{12}x_{22}\ldots x_{n2}\ldots$.

Then $g$ yields a bijection from $I^n$ onto $g(I^n) = V \subseteq I$.

Thus $I^n \sim V \subseteq I$.

By the Schroeder-Bernstein Theorem, the relations $I \sim T \subseteq I^n$ and $I^n \sim V \subseteq I$ yield $I^n \sim I$. □
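The digit-interleaving map $g$ can be illustrated on truncated expansions. The sketch below assumes each coordinate is given by a string of its first few decimal digits, all of the same length; the names `interleave` and `deinterleave` are hypothetical, and working with finitely many digits is only an approximation of the map in the proof.

```python
def interleave(expansions):
    """Merge n digit strings column by column: x11 x21 ... xn1 x12 x22 ... xn2 ..."""
    return ''.join(''.join(column) for column in zip(*expansions))


def deinterleave(digits, n):
    """Recover the n digit strings from an interleaved string."""
    return [digits[k::n] for k in range(n)]


# The point (0.123..., 0.456...) of the unit square, truncated to three digits.
point = ["123", "456"]
merged = interleave(point)   # digits of g(point) after the decimal point
```

That `deinterleave` recovers the coordinates is the finite counterpart of the injectivity of $g$.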

Remark The cardinal numbers of I n and I are the same. 

There is a question arising naturally: does there exist any transfinite cardinal
number greater than $c$? The answer is affirmative. The following theorem prescribes
a definite method for obtaining a set whose cardinal number is greater than the
cardinal number of the given set.

Theorem 1.3.7 If $A$ is a non-empty set and $\mathcal{P}(A)$ is its power set, then $|A| < |\mathcal{P}(A)|$.

Proof The map $f : A \to \mathcal{P}(A)$ defined by $f(a) = \{a\}$, for all $a \in A$, is injective.
Hence, $|A| \le |\mathcal{P}(A)|$. But there is no bijection $g : A \to \mathcal{P}(A)$ (see Ex. 8 of SE-I).
Hence, $|A| \ne |\mathcal{P}(A)|$, which implies that $|A| < |\mathcal{P}(A)|$. □

Using this theorem and the result in Ex. 14 of SE-I, the following corollary follows.

Corollary For any cardinal number $\alpha$, $\alpha < 2^\alpha$, where $\alpha$ is the cardinality of some
set $A$ and $2^\alpha$ is the cardinality of the set of all functions from $A$ to $\{0, 1\}$. In
particular, $c < 2^c$.

Remark Like natural numbers, the law of trichotomy holds for cardinal numbers.


1.3.1 Continuum Hypothesis 

We have shown the existence of three distinct transfinite cardinal numbers $d$, $c$ and
$2^c$ such that $d < c < 2^c$.

We now state the following natural questions which are still unsolved:

Unsolved Problem 1 Does there exist any cardinal number $\alpha$ such that $d < \alpha < c$?

Unsolved Problem 2 Does there exist any cardinal number $\beta$ such that $c < \beta < 2^c$?

Cantor's continuum hypothesis, stated by Georg Cantor in 1878, says that there is
no cardinal number $\alpha$ strictly lying between $d$ and $c$. Kurt Gödel showed in 1939
that if the usual axioms of set theory are consistent, then the introduction of the
continuum hypothesis does not create any inconsistency [see Cohn (2002)]. Again, the
generalized continuum hypothesis asserts that there is no cardinal number $\beta$
satisfying $\alpha < \beta < 2^\alpha$, for any transfinite cardinal $\alpha$. P.J. Cohen showed in 1963 that
the generalized continuum hypothesis is independent of the axioms of set theory [see
Cohen (1966)].

Supplementary Examples (SE-I) 

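1 Let $A$, $B$, $C$, and $D$ be non-empty sets. Then

(i) $(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$;

(ii) $(A \times B) \subseteq C \times C \Rightarrow A \subseteq C$;

(iii) $A \times B = B \times A \Leftrightarrow A = B$ (see Proposition 1.1.5);

(iv) $A \ne B \Rightarrow A \times B \ne B \times A$.

[Hint. (i) $(a, b) \in (A \times B) \cap (C \times D) \Rightarrow (a, b) \in A \times B$ and $(a, b) \in C \times D \Rightarrow
a \in A$, $b \in B$, $a \in C$ and $b \in D \Rightarrow a \in A \cap C$ and $b \in B \cap D \Rightarrow (a, b) \in (A \cap C) \times
(B \cap D) \Rightarrow (A \times B) \cap (C \times D) \subseteq (A \cap C) \times (B \cap D)$.

Similarly, $(a, b) \in (A \cap C) \times (B \cap D) \Rightarrow (a, b) \in (A \times B) \cap (C \times D) \Rightarrow (A \cap
C) \times (B \cap D) \subseteq (A \times B) \cap (C \times D)$. Consequently, $(A \times B) \cap (C \times D) = (A \cap
C) \times (B \cap D)$.

(ii) Let $b$ be an arbitrary element of $B$. Then $a \in A \Rightarrow (a, b) \in A \times B \subseteq C \times C \Rightarrow
a \in C \Rightarrow A \subseteq C$.

(iii) $(a, b) \in A \times B = B \times A \Rightarrow a \in B$ and $b \in A \Rightarrow A \subseteq B$ and $B \subseteq A \Rightarrow A = B$.
The converse is trivial.

(iv) Suppose $A \ne B$. If possible, let $A \times B = B \times A$. Then (iii) yields $A = B$ $\Rightarrow$ a
contradiction.]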

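2 For each $n \in \mathbb{N}^+$, let $A_n = [0, \frac{1}{n}]$. Then $\bigcap_{n=1}^{\infty} A_n = \{0\}$.

[Hint. Let $M = \bigcap_{n=1}^{\infty} A_n$. Clearly $0 \in M$. We claim that $M = \{0\}$. Otherwise,
there is a real number $x \in M$ such that $0 < x \le \frac{1}{n}$ for all $n \in \mathbb{N}^+$. Then $n \le \frac{1}{x}$ for all
$n \in \mathbb{N}^+$ $\Rightarrow$ a contradiction.]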

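3 Let $A$, $B$ be two given sets such that $A \subseteq B$. A relation $\rho$ is defined on $\mathcal{P}(B)$ by
$(X, Y) \in \rho \Leftrightarrow X \cap A = Y \cap A$. Then

(i) $\rho$ is an equivalence relation;

(ii) the $\rho$-class $\emptyset\rho$ of $\emptyset$ is $\mathcal{P}(B - A)$;

(iii) $|A| = m \Rightarrow |\mathcal{P}(B)/\rho| = 2^m$.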

4 (i) A relation $\rho$ on $\mathbb{R}^2$ is defined by $((a, b), (x, y)) \in \rho \Leftrightarrow a + y = b + x$. Then
$\rho$ is an equivalence relation.

(ii) Let $\rho$ and $\mu$ be the relations on $\mathbb{R}$ defined by $\rho = \{(x, y) : x^2 + y^2 = 1\}$ and
$\mu = \{(y, z) : 2y + 3z = 4\}$. Then $\mu \circ \rho = \{(x, z) : 4x^2 + 9z^2 - 24z + 12 = 0\}$ (by
eliminating $y$ from the above two equations).

Remark Considering $\mathbb{R}^2$ as the Euclidean plane, the $\rho$-class $(a, b)\rho$, in 4(i),
represents geometrically the straight line having gradient 1 and passing through the
point $(a, b)$.

5 Let $M_n(\mathbb{R})$ denote the set of $n \times n$ real matrices. Define a binary relation $\rho$ on
$M_n(\mathbb{R})$ by $(A, B) \in \rho \Leftrightarrow A^r = B^t$ for some $r, t \in \mathbb{N}^+$. Then $\rho$ is an equivalence
relation.

6 For any real x, the symbol [x] denotes the greatest integer less than or equal to x 
(for existence of [x], see Ex. 5 of SE-II). 

(a) Let the relation $\rho$ be defined on $\mathbb{R}$ by $(x, y) \in \rho \Leftrightarrow [x] = [y]$. Then $\rho$ is an
equivalence relation and for each $n \in \mathbb{N}^+$, $n\rho = [n, n + 1)$.

(b) Let $f, g : \mathbb{N}^+ \to \mathbb{N}^+$ be the mappings given by $f(x) = 2x$, $g(x) = [\frac{1}{2}(x + 1)]$.
Then $f$ is injective but not surjective, and $g$ is surjective but not injective.

(c) Let h : N + Z be defined by h(n) = 

Then h is a bijection. 

7 Let the relation $\rho$ be defined on $\mathbb{N}^+ \times \mathbb{N}^+$ by $(a, b)\rho(x, y) \Leftrightarrow a + y = b + x$.
Then $\rho$ is an equivalence relation. Define a map $f : (\mathbb{N}^+ \times \mathbb{N}^+)/\rho \to \mathbb{Z}$ by

$f((a, b)\rho) = a - b$, $\forall a, b \in \mathbb{N}^+$.

Then $f$ is well defined and bijective.

8 Let $S$ be a non-empty set. Then there exists no bijection $f : S \to \mathcal{P}(S)$.

[Hint. If possible, let there exist a bijection $f : S \to \mathcal{P}(S)$. Define $A$ by the
rule: $A = \{x \in S : x \notin f(x)\}$. Then $A \in \mathcal{P}(S)$. Clearly, $f$ is a bijection $\Rightarrow$
there exists an $s \in S$ such that $f(s) = A$. Now the definition of $A$ shows
that if $s \in A$, then $s \notin f(s) = A$ and conversely. This yields a contradiction.]
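For a small finite set the argument of this hint can be checked exhaustively. The sketch below assumes a three-element set and enumerates every map from it into its power set; the names `power_set`, `diagonal_set` and `diagonal_set_ever_hit` are hypothetical.

```python
from itertools import combinations, product


def power_set(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def diagonal_set(f, A):
    """The set {x in A : x not in f(x)} used in the hint."""
    return frozenset(x for x in A if x not in f[x])


def diagonal_set_ever_hit(A):
    """Return True if some map f : A -> P(A) has its diagonal set among its values."""
    items = sorted(A)
    for images in product(power_set(A), repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(f, A) in images:
            return True
    return False


S = {0, 1, 2}
example = {0: frozenset({0, 1}), 1: frozenset(), 2: frozenset({0, 2})}
missed = diagonal_set(example, S)   # the subset {1}, which is not a value of example
```

The exhaustive search over all 512 maps never finds the diagonal set among the values, so no map from `S` to its power set is surjective. The hint shows that the same holds for every non-empty set.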

9 Let $A = \{0, 1\}$ and $B$, $C$ be non-empty sets and $f : B \to C$ be a non-surjective
map. Then there exist distinct maps $g, h : C \to A$ such that $g \circ f = h \circ f$.

[Hint. Define $g, h : C \to A$ by $g(x) = 0$ for all $x \in C$, and $h(x) = 0$ if $x \in$ Im $f \subseteq
C$, $h(x) = 1$ if $x \in C -$ Im $f$ ($\ne \emptyset$). Show that $g \ne h$ but $g \circ f = h \circ f$.]

10 If $f : A \to B$, $g : B \to C$ and $h : B \to C$ are maps such that $g \circ f = h \circ f$ and
$f$ is surjective, then $g = h$.

[Hint. Let $b$ be an arbitrary element of $B$. Then there exists some $a \in A$ such that
$b = f(a)$, since $f$ is surjective. Now $g(b) = g(f(a)) = (g \circ f)(a) = (h \circ f)(a) =
h(f(a)) = h(b) \Rightarrow g = h$ (since $g, h : B \to C$ and $b$ is an arbitrary element
of $B$).]

11 (A × B) × C = A × (B × C) ⇔ at least one of the sets A, B, C is empty.

[Hint. If one of the sets A, B, C is empty, then by Proposition 1.1.3,
(A × B) × C = ∅ = A × (B × C). Conversely, suppose (A × B) × C = A × (B × C). If none
of the sets A, B, C is ∅, then there exists at least one element (a, c) ∈ (A × B) × C
where a ∈ A × B and c ∈ C. By hypothesis (a, c) ∈ A × (B × C) ⇒ a ∈ A ⇒
a contradiction.]

12 (i) Any two open intervals (a, b) and (c, d) are equivalent;

(ii) All the intervals (0, 1), [0, 1) and (0, 1] are equivalent to [0, 1]; 

(iii) Any open interval is equivalent to [0, 1]; 

(iv) R is equivalent to [0, 1]. 

[Hint. (i) The map f : (a, b) → (c, d) defined by f(x) = c + ((d − c)/(b − a))(x − a) is a
bijection.

(ii) Let A = {x ∈ (0, 1) : x cannot be expressed in the form 1/n for integers n >
1} = (0, 1) − {1/2, 1/3, 1/4, ...}. Hence (0, 1) = A ∪ {1/2, 1/3, 1/4, ...}.

Then the function f : [0, 1] → (0, 1) defined by

f(x) = 1/(n + 2)   if x = 1/n, n ∈ N⁺

     = 1/2         if x = 0

     = x           if x ∈ A

is a bijection.

The sets [0, 1] and [0, 1) are equivalent under the bijection g : [0, 1] → [0, 1)
defined by

g(x) = x           if x ≠ 1/n, n ∈ N⁺

     = 1/(n + 1)   if x = 1/n, n ∈ N⁺.

(iii) Any open interval (a, b) is equivalent to (0, 1) by (i) and (0, 1) is equivalent
to [0, 1] by (ii). Hence (a, b) is equivalent to [0, 1] by Theorem 1.2.10.

(iv) Consider the bijection f : (−π/2, π/2) → R defined by f(x) = tan x. Then use
(iii) and Theorem 1.2.10.]


13 For any non-empty set A, the collection B(A) of all characteristic functions on
the power set of A is equivalent to the power set P(A).

[Hint. Define the function f : P(A) → B(A) by f(X) = χ_X, the characteristic
function of X, given by

χ_X : A → {0, 1},

where

χ_X(x) = 1 if x ∈ X

       = 0 if x ∈ A − X = X′.

Then f is a bijection.]

14 Let f : A → B be one-one. Then f induces a set function f* : P(A) → P(B)
such that f* is one-one.

[Hint. Define f* : P(A) → P(B) by f*(X) = f(X).

If A = ∅, P(A) = {∅} and hence f* is one-one.

If A ≠ ∅, P(A) has at least two elements. Let X, Y ∈ P(A) be such that X ≠ Y.
Then ∃ p ∈ X such that p ∉ Y (or p ∈ Y, p ∉ X). Thus f(p) ∈ f(X) and since
f is one-one, f(p) ∉ f(Y) (or f(p) ∈ f(Y) but f(p) ∉ f(X)). Hence f(X) ≠
f(Y) ⇒ f* is one-one.]

15 Let ρ be an equivalence relation on a non-empty set A. Then the natural projection
map p : A → A/ρ defined by p(a) = aρ = (a), the equivalence class of a
under ρ, is surjective but, in general, not injective.

16 Let f : A → B be a map and ρ be an equivalence relation on A defined by
a ρ b ⇔ f(a) = f(b). Let f̃ : A/ρ → f(A) be the map defined by f̃(aρ) = f(a).
Then f̃ is a bijection.

[Hint. f̃ is independent of the choice of the representatives of the classes and
hence f̃ is well defined. Again for any x ∈ f(A), ∃ a ∈ A such that x = f(a). Then
f̃(aρ) = f(a) = x shows that f̃ is surjective. Moreover, f̃(aρ) = f̃(bρ) means
f(a) = f(b), i.e., aρ = bρ, so f̃ is injective. Hence f̃ is a bijection.]

Fig. 1.7 Graphs of ρ and ρ⁻¹

17 Let ρ be the relation on R defined by x ρ y ⇔ 0 < x − y < 1. Express ρ and ρ⁻¹
as subsets of R × R and draw their graphs, where

ρ = {(x, y) : x, y ∈ R, 0 < x − y < 1} ⊂ R × R

and

ρ⁻¹ = {(x, y) : (y, x) ∈ ρ} = {(x, y) : x, y ∈ R, 0 < y − x < 1} ⊂ R × R.


[Hint. See Fig. 1.7.] 

18 Let ρ be a relation from a set A to a set B. Then ρ uniquely defines a subset ρ*
of A × B as follows:

ρ* = {(a, b) ∈ A × B : a ρ b} ⊂ A × B.

Again any subset ρ* of A × B defines a relation ρ from A to B by a ρ b ⇔
(a, b) ∈ ρ*.

Domain of ρ = {a ∈ A : (a, b) ∈ ρ for some b ∈ B},

Range of ρ = {b ∈ B : (a, b) ∈ ρ for some a ∈ A}.

Note that this correspondence between the relation ρ from A to B and subsets of
A × B justifies Definition 1.2.1.

Let ρ be a relation from A to B, i.e., ρ ⊂ A × B and suppose that the domain
of ρ is A. Then ∃ a subset f* of ρ such that f* is a function from A into B.

Let ℬ be the class of subsets f of ρ such that f ⊂ ρ is a function from a subset of
A into B. Order ℬ partially by set inclusion. If f : A₁ → B is a subset of g : A₂ → B,
then A₁ ⊂ A₂.

Fig. 1.8 Modular but not distributive lattice


[Hint. Let 𝒜 = {fᵢ : Aᵢ → B}, i ∈ I, be a totally ordered subset of ℬ. Then f =
⋃ᵢ fᵢ is a function from ⋃ᵢ Aᵢ into B. Moreover, f ⊂ ρ. Hence f is an upper
bound of 𝒜. Then, by Zorn’s Lemma, ℬ has a maximal element f* : A* → B. We
claim that A* = A. If A* ≠ A, ∃ a ∈ A such that a ∉ A*. Since domain of ρ = A,
∃ an ordered pair (a, b) ∈ ρ. Hence f* ∪ {(a, b)} is a function from A* ∪ {a} into
B. This contradicts the fact that f* is a maximal element in ℬ. Hence A* = A proves
the result.]

19 Let ρ be the relation on N⁺ defined by (a, b) ∈ ρ ⇔ a < b. Hence (a, b) ∈
ρ⁻¹ ⇔ b < a. Then ρ ∘ ρ⁻¹ ≠ ρ⁻¹ ∘ ρ.

[Hint. ρ ∘ ρ⁻¹ = {(x, y) ∈ N⁺ × N⁺ : ∃ b ∈ N⁺ such that (x, b) ∈ ρ⁻¹, (b, y) ∈
ρ} = {(x, y) ∈ N⁺ × N⁺ : ∃ b ∈ N⁺ such that b < x, b < y} = (N⁺ − {1}) ×
(N⁺ − {1}).

Again

ρ⁻¹ ∘ ρ = {(x, y) ∈ N⁺ × N⁺ : ∃ b ∈ N⁺ such that (x, b) ∈ ρ, (b, y) ∈ ρ⁻¹}

= {(x, y) ∈ N⁺ × N⁺ : ∃ b ∈ N⁺ such that x < b, y < b}

= N⁺ × N⁺.]

20 The lattice given by the following poset is modular, but not distributive, as
shown in Fig. 1.8, because x ∨ (y ∧ z) = x ∨ 0 = x ≠ 1 = (x ∨ y) ∧ (x ∨ z).


1.3.2 Exercises 


Exercises-I 

1. Let U be the universal set and A, B, S, T, X be sets. Prove that

(i) A ∩ B ⊂ A and A ∩ B ⊂ B;

(ii) U ∩ A = A;

(iii) A ∩ ∅ = ∅;

(iv) (A − B) ⊂ A;

(v) (A − B) ∩ B = ∅;

(vi) B − A ⊂ A′;

(vii) B − A′ = B ∩ A;

(viii) A ∩ (B Δ C) = (A ∩ B) Δ (A ∩ C);

(ix) (A ∪ B) ∩ (A ∪ B′) = A;

(x) (A ∩ B) ∪ (A ∩ B′) = A;

(xi) (A ∪ B)′ = A′ ∩ B′ and (A ∩ B)′ = A′ ∪ B′ (De Morgan’s Law);

(xii) S ⊂ T ⇒ S ∪ X ⊂ T ∪ X and S ∩ X ⊂ T ∩ X for any set X;

(xiii) S ⊂ T and T ⊂ X ⇒ S ⊂ X;

(xiv) S ⊂ T ⇔ S ∩ T = S;

(xv) T ⊂ S ⇔ S = T ∪ S;

(xvi) T ⊂ X ⇔ (X − T) ∪ T = X;

(xvii) S × T = T × S ⇔ either S = T or at least one of S, T is ∅;

(xviii) S × T = S × X ⇔ either T = X or S = ∅;

(xix) (S × T) × X = S × (T × X) ⇔ at least one of the sets S, T, X is ∅.

Fig. 1.9 Commutativity of the diagram



2. Prove the following: 

(i) If S is any set, then S x S is an equivalence relation on S. 

(ii) If ρ and σ are symmetric relations on a set A, then ρ ∘ σ is also a symmetric
relation on A ⇔ ρ ∘ σ = σ ∘ ρ.

(iii) If a relation ρ on a set A is transitive, then ρ⁻¹ is also transitive on A.

(iv) ρ is a reflexive and transitive relation on A ⇒ ρ ∩ ρ⁻¹ is an equivalence
relation.


3. If A and B are two sets and f : A → B is a surjection, then show that

(i) f induces an equivalence relation ρ on A;

(ii) there is a surjection g : A → A/ρ; and

(iii) there is a bijection h : B → A/ρ such that the diagram as shown in Fig. 1.9
is commutative, i.e., h ∘ f = g.

4. If A, B, C are sets, then show that

(a) (i) A × B ~ B × A;

(ii) (A × B) × C ~ A × B × C ~ A × (B × C);

(b) (0, 1) ~ R.


5. Let L be a lattice. If x, y, z ∈ L, then prove that

(i) x ∧ x = x and x ∨ x = x;

(ii) x ∧ y = y ∧ x and x ∨ y = y ∨ x;

(iii) x ∧ (y ∧ z) = (x ∧ y) ∧ z and x ∨ (y ∨ z) = (x ∨ y) ∨ z;

(iv) (x ∧ y) ∨ x = x and (x ∨ y) ∧ x = x.

6. Let L be a non-empty set in which two operations ∧ and ∨ are defined and
assume that these operations satisfy the conditions of Exercise 5. A binary relation
≤ is defined on L: x ≤ y ⇔ x ∧ y = x. Then prove that (L, ≤) is a lattice
in which x ∧ y and x ∨ y are the glb and lub of x and y, respectively.

Fig. 1.10 Diamond lattice



7. A lattice (L, ≤) is said to be complemented iff it contains distinct elements 0
and 1 such that 0 ≤ x ≤ 1 for every x ∈ L and each element x has a complement
x′ with the property: x ∧ x′ = 0 and x ∨ x′ = 1. A Boolean algebra is defined
to be a complemented distributive lattice. If B is a Boolean algebra, then prove
that

(i) each element has only one complement;

(ii) 0′ = 1 and 1′ = 0;

(iii) x ≤ y ⇔ y′ ≤ x′ for x, y ∈ B;

(iv) (x ∧ y)′ = x′ ∨ y′ and (x ∨ y)′ = x′ ∧ y′ for all x, y ∈ B.

8. Prove the following: 

(a) If in a poset P, every subset (including ∅) has a glb in P, then P is a
complete lattice.

(b) If a poset P has 1, and every non-empty subset X of P has a glb, then P is
a complete lattice.

9. If L is a lattice, prove that the following statements are equivalent: 

(i) L is distributive; 

(ii) ab + c = (a + c)(b + c), ∀ a, b, c ∈ L;

(iii) ab + bc + ca = (a + b)(b + c)(c + a), ∀ a, b, c ∈ L (where a ∧ b = ab and
a ∨ b = a + b).

10. Give an example of a lattice which is modular but not distributive. Consider the
diamond as shown in Fig. 1.10. This lattice is modular, as there is no pentagon.
But it is not distributive, since a(b + c) = a1 = a and ab + ac = 0 ⇒ a(b + c) ≠
ab + ac.
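
The claim in Exercise 10 can be checked mechanically. The following is a minimal Python sketch, assuming the diamond consists of a bottom element 0, a top element 1 and three pairwise incomparable elements a, b, c; the function names (`leq`, `meet`, `join`, `is_distributive`, `is_modular`) are illustrative and do not come from the text.

```python
from itertools import product

# Assumed model of the diamond lattice: bottom "0", top "1",
# and three pairwise incomparable middle elements "a", "b", "c".
ELEMENTS = ["0", "a", "b", "c", "1"]

def leq(x, y):
    """Order relation: 0 <= everything <= 1; the middle elements are incomparable."""
    return x == y or x == "0" or y == "1"

def meet(x, y):
    """Greatest lower bound (written xy in Exercise 9)."""
    lower = [z for z in ELEMENTS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))

def join(x, y):
    """Least upper bound (written x + y in Exercise 9)."""
    upper = [z for z in ELEMENTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))

def is_distributive():
    """Check x(y + z) = xy + xz for all triples."""
    return all(meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
               for x, y, z in product(ELEMENTS, repeat=3))

def is_modular():
    """Check the modular law: x <= z implies x + yz = (x + y)z."""
    return all(join(x, meet(y, z)) == meet(join(x, y), z)
               for x, y, z in product(ELEMENTS, repeat=3) if leq(x, z))
```

With these definitions, `is_modular()` holds while `is_distributive()` fails; the triple (a, b, c) is one witness, since a(b + c) = a but ab + ac = 0.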

11. Prove the following:

(i) If A and B are two finite sets such that |A| = m and |B| = n, then
|A × B| = mn;

(ii) If |X| = n, then

(a) the set of all possible mappings X → X consists of nⁿ elements;

(b) the set of all bijections X → X consists of n! elements (each bijection
on X is called a permutation of X).
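
For small finite sets the counts in Exercise 11 can be confirmed by brute-force enumeration. The sketch below is only a numerical sanity check, not a proof; a map X → X is represented as a Python dict, and the helper names `all_maps` and `all_bijections` are hypothetical.

```python
from itertools import permutations, product

def all_maps(xs):
    """Every mapping X -> X: choose an image in X for each element of X."""
    xs = list(xs)
    return [dict(zip(xs, images)) for images in product(xs, repeat=len(xs))]

def all_bijections(xs):
    """Every bijection X -> X: the images form a rearrangement of X."""
    xs = list(xs)
    return [dict(zip(xs, images)) for images in permutations(xs)]
```

For a three-element set, `len(all_maps("abc"))` is 3³ = 27 and `len(all_bijections("abc"))` is 3! = 6.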

12. Let α and β be cardinal numbers and A, B disjoint sets such that |A| = α, |B| = β.
Define α + β = |A ∪ B|, αβ = |A × B| and β^α = |B^A|, the cardinal number of
the set B^A of all functions from A to B. Prove that for any cardinal numbers α,
β and γ,

(i) (α + β) + γ = α + (β + γ);

(ii) (αβ)γ = α(βγ);

(iii) α + β = β + α;

(iv) αβ = βα;

(v) α(β + γ) = αβ + αγ;

(vi) α^β α^γ = α^(β+γ);

(vii) (α^β)^γ = α^(βγ);

(viii) (αβ)^γ = α^γ β^γ.


13. Show by examples that the cancellation law for the operations of addition and 
multiplication of cardinal numbers is not true. 

14. Let α = |A| and β = |B|. Furthermore, let A ~ B′ ⊂ B; i.e., suppose there is
a mapping f : A → B which is injective. Then write A ≼ B, which reads ‘A
precedes B’, and α ≤ β, which reads ‘α is less than or equal to β’.

Suppose A ≺ B means A ≼ B and A ≁ B; α < β means α ≤ β and α ≠ β.
Then prove the following:

(i) the relation on sets defined by A ≼ B is reflexive and transitive, and the
relation on the cardinal numbers defined by α ≤ β is also reflexive and
transitive;

(ii) for any cardinal number α, α < 2^α (Cantor’s Theorem);

(iii) for cardinal numbers α and β, α ≤ β and β ≤ α ⇒ α = β.


1.4 Integers 

We are familiar with positive integers as natural numbers from our childhood 
through the process of counting and start mathematics with them in an informal 
way by learning to count followed by learning addition and multiplication, the latter 
being a repeated process of addition. The discovery of positive integers goes back to 
early human civilization. Leopold Kronecker (1823-1891), a great mathematician,
once said ‘God made the positive integers, all else is due to man’. The aim of this
section is to establish certain elementary properties of integers, which we need in
order to develop and illustrate the materials of later chapters. Moreover, some results
on the theory of numbers are proved in Chap. 10 with an eye to applying them to
Basic Cryptography (Chap. 12).

G. Peano (1858-1932), an Italian mathematician, first showed in 1889 that the
positive integers can be defined formally by a set of axioms, now known as
Peano’s axioms, which form the foundation of arithmetic.

Peano’s Axiom A non-empty set N⁺ having the following properties is called
the set of natural numbers and every element of N⁺ is called a natural number.

(1) to every n ∈ N⁺, there corresponds a unique element s(n) ∈ N⁺, called the
successor of n, and n is called the predecessor of s(n);

(2) s(n) = s(m) implies n = m;

(3) there exists an element of N⁺, denoted by 1, which is not the successor of any
element of N⁺, i.e., 1 has no predecessor;

(4) if A ⊂ N⁺ is such that 1 ∈ A, and n ∈ A implies s(n) ∈ A, then A = N⁺.

The axiom (4) leads to one of the important principles of mathematics known as
the ‘Principle of Mathematical Induction’, i.e., if p is a property such that 1 has the
property p and, whenever a particular natural number n has the property p,
s(n) also has the property p, then every natural number has the property p.

Definition 1.4.1 For n, m ∈ N⁺, we define

(a) addition ‘+’ recursively by n + 1 = s(n), n + s(m) = s(n + m);

(b) multiplication ‘·’ by n · 1 = n, n · s(m) = n · m + n;

(c) order relation > (or <): m is said to be greater than n, denoted by m > n (or
equivalently, n is said to be less than m, denoted by n < m) iff m = n + p for
some p ∈ N⁺.
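
The recursive clauses of Definition 1.4.1 translate directly into code. The following is a minimal sketch, assuming natural numbers are modelled by Python integers ≥ 1 with the successor s(n) realised as n + 1; the names `s`, `pred`, `add`, `mul` and `greater` are illustrative, and the recursion is only practical for small arguments.

```python
def s(n):
    """Successor map (assumption: N+ is modelled by Python ints >= 1)."""
    return n + 1

def pred(n):
    """Predecessor of n; only meaningful for n > 1, since 1 has no predecessor."""
    if n <= 1:
        raise ValueError("1 has no predecessor")
    return n - 1

def add(n, m):
    """n + 1 = s(n);  n + s(m) = s(n + m)."""
    return s(n) if m == 1 else s(add(n, pred(m)))

def mul(n, m):
    """n . 1 = n;  n . s(m) = n . m + n."""
    return n if m == 1 else add(mul(n, pred(m)), n)

def greater(m, n):
    """m > n iff m = n + p for some p in N+ (any such p is below m)."""
    return any(add(n, p) == m for p in range(1, m))
```

For instance, `add(3, 4)` unwinds to s(s(s(s(3)))) and `mul(3, 4)` to ((3 + 3) + 3) + 3, so both operations are built from the successor map alone.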

Then some basic properties of natural numbers follow: 

Proposition 1.4.1 For all m, n, p ∈ N⁺, we have

(i) associative property of addition: (m + n) + p = m + (n + p);

(ii) associative property of multiplication: (m · n) · p = m · (n · p);

(iii) commutative property of addition: m + n = n + m;

(iv) commutative property of multiplication: m · n = n · m;

(v) cancellation property of addition: m + n = m + p implies n = p;

(vi) cancellation property of multiplication: m · n = m · p implies n = p;

(vii) left distributive property: m · (n + p) = m · n + m · p;

right distributive property: (n + p) · m = n · m + p · m;

(viii) property of trichotomy: Exactly one of the following holds:

m > n, m < n, m = n.

In respect of subtraction and division in N⁺, if m > n, then m − n is defined to
be the positive integer p such that m = n + p. Thus m − n has a meaning in N⁺
iff m > n. Similarly, if there is a positive integer p such that m = np, then m/n is
defined to be the positive integer p. Thus m/n has a meaning iff n is a factor of m.
Clearly, for any two positive integers m and n we cannot always define m − n or
m/n in N⁺. These difficulties are overcome by extending the set N⁺ to the set of
integers Z and to the set of rational numbers Q, respectively.

Remark If we define Q as the set S of all numbers of the form m/n, where m and n
are integers and n ≠ 0, then each element of Q is represented by an infinite number
of elements of S. For example, 3/4 = 6/8 = 12/16 = ··· . While doing mathematics,
we identify all the elements of S that represent the same rational number by an
equivalence relation ~ on S: a/b ~ c/d if and only if ad = bc. Each equivalence
class of S is considered to be a rational number.
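
The identification in the Remark can be illustrated as follows. This is a sketch under the assumption that an element a/b of S is stored as a pair (a, b) of Python integers with b ≠ 0; the helpers `equivalent` and `canonical` are hypothetical names, and `canonical` simply picks one convenient representative (lowest terms, positive denominator) from each class.

```python
from math import gcd

def equivalent(x, y):
    """a/b ~ c/d if and only if ad = bc (both denominators must be non-zero)."""
    (a, b), (c, d) = x, y
    if b == 0 or d == 0:
        raise ValueError("denominator must be non-zero")
    return a * d == b * c

def canonical(x):
    """One representative of the class of a/b: lowest terms, denominator > 0."""
    a, b = x
    if b == 0:
        raise ValueError("denominator must be non-zero")
    g = gcd(a, b)
    if b < 0:
        g = -g
    return (a // g, b // g)
```

Two pairs are equivalent exactly when they have the same canonical representative; for example (3, 4), (6, 8) and (12, 16) all reduce to (3, 4).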





We now extend the set Q of rational numbers to a larger set containing Q in
which all the properties of Q hold along with an additional property, called the
order completeness property (i.e., every non-empty bounded set has a greatest lower
bound and a least upper bound, concepts analogous to the greatest common divisor
and least common multiple in the theory of divisibility). This extended set, denoted
by R, is called the set of real numbers. We have already used the term Euclidean line without
any explanation. We use the letter R (or R¹) to denote an ordinary geometric straight
line whose points have been identified with the set R of real numbers. We use the 
same letter R to denote the real line and the set of real numbers and say that a real 
number corresponds to a unique point on the line and conversely. We assume that 
the reader is familiar with the properties of the real line, such as inequality of real 
numbers, usual addition, subtraction, multiplication, division, and lub. We use these 
properties in subsequent chapters to prove many interesting results. Apart from this 
approach, there are essentially two methods of constructions of R. 

1. Dedekind’s Method of completion by cuts and 

2. Cantor’s Method of sequences. 

Clearly, N⁺ ⊊ Z ⊂ Q ⊂ R.

In the rest of the section we study the properties of integers only and we discuss 
some of the important tools that are useful for proving theorems. We assume that 
the reader is thoroughly familiar with the set Z of integers, the set N + of positive 
integers and elementary properties of addition, multiplication, and order relation on 
Z and N + . We begin by stating an important axiom, the well-ordering principle. 

Principle of Well-Ordering Every non-empty subset S of N⁺ contains a least
element b ∈ S (i.e., ∃ an element b ∈ S such that b ≤ a for all a ∈ S).

Next we shall show a close connection between the principle of well-ordering 
and the principle of mathematical induction. 

Theorem 1.4.1 The principle of well-ordering implies the principle of mathematical
induction.

Proof Let S be a non-empty subset of N⁺ such that 1 ∈ S and every n ∈ S implies
n + 1 ∈ S. We claim that S = N⁺. If N⁺ − S ≠ ∅, then by the well-ordering principle,
N⁺ − S contains a least positive integer n > 1 (as 1 ∉ N⁺ − S). Consequently,
n − 1 is a positive integer such that n − 1 ∉ N⁺ − S. As a result n − 1 ∈ S. Then
n = (n − 1) + 1 ∈ S by hypothesis. But this is a contradiction. Therefore, N⁺ − S = ∅
and hence N⁺ = S. □

Remark We now show that the principle of mathematical induction with usual order
relation ‘≤’ on N⁺ implies the principle of well-ordering.

Proof Let S be a non-empty subset of N⁺ and T = {n ∈ N⁺ : n ≤ p for every p in S}
(usual order relation ‘≤’ on N⁺ is assumed). Then 1 ∈ T, so T ≠ ∅. Again T ≠ N⁺,
because s ∈ S ⇒ s + 1 ∉ T (since s + 1 ≰ s). Then there exists a natural number
q ∈ T such that q + 1 ∉ T; otherwise, it follows by the principle of mathematical
induction that T = N⁺. Now by definition of T, q ≤ every p ∈ S. We claim that
q ∈ S. If not, then q ∉ S ⇒ q < p, ∀ p ∈ S. Thus for every p ∈ S, p = q + r, so that
p = q + 1 (when r = 1) or p = q + s(v), where v is the predecessor of r and s(v)
denotes the successor of v. Consequently, q + 1 ≤ p for every p ∈ S ⇒ q + 1 ∈ T,
a contradiction. As a result q ∈ S and q is a least element in S. Moreover, from the
antisymmetric property of the relation ‘≤’ it follows that q is unique. □

We now present “division algorithm ” for integers. An algorithm means a proce- 
dure or a method. The usual method of dividing an integer a by a positive integer 
b gives a quotient q and a remainder r, which may be 0 or a positive integer less 
than b. This concept, essentially due to Euclid, is now called ‘Division Algorithm’. 
A geometrical proof is given first, followed by an analytical proof by using the 
‘well-ordering principle’. 

A geometrical proof of division algorithm is given when a and b are positive 
integers. 

Theorem 1.4.2 (Division algorithm) Let a, b ∈ N⁺ and b ≠ 0. Then there exist
unique integers q and r such that

a = bq + r, and 0 ≤ r < b.

Geometrical Proof Consider all multiples qb (q = 0, 1, 2, ...) of b and plot them on
the real line. Then the integer a lies at some point qb or between two points qb and (q + 1)b.
In the first case, a − qb = r = 0. In the second case qb < a < (q + 1)b. Hence
0 < r = a − qb < ((q + 1)b − qb) = b implies a = qb + r, where 0 < r < b. □

A more general form of Division algorithm is now presented. 

Theorem 1.4.3 (Division algorithm) Let a, b ∈ Z and b ≠ 0. Then there exist
unique integers q and r such that

a = bq + r, and 0 ≤ r < |b|.

Proof We first prove the existence of q and r. If a = 0, then q = 0 and r = 0. So
we assume that a ≠ 0.

Case 1. Suppose b > 0. Then b ≥ 1. If b = 1, then q = a and r = 0. If a = bm for
some m ∈ Z, then q = m and r = 0. For the remaining possibility, consider the set

S = {a − bm : m ∈ Z and a − bm > 0}.

Since b > 1, a − (−|a|)b = a + |a|b > 0 and hence a − (−|a|)b ∈ S. Consequently,
S ≠ ∅. Hence by the well-ordering principle S has a least element r. Thus
there exists an integer q such that r = a − bq > 0. Hence a = bq + r, 0 < r. We
now prove that r < b. Note that r ≠ b, since a is not a multiple of b. Suppose r > b.
Then r = b + c for some integer c such that
0 < c < r. Hence c = r − b = a − bq − b = a − b(q + 1) ∈ S. This contradicts the
minimality of r in S. Hence r < b.

Case 2. Suppose b < 0. Since −b > 0, by Case 1, there exist q′, r in Z such that

a = (−b)q′ + r, 0 ≤ r < −b = |b|.

Taking q = −q′, it follows that a = bq + r, 0 ≤ r < |b|.

From the above two cases we find that there exist q and r in Z such that

a = bq + r, 0 ≤ r < |b|. (i)

Suppose now a = bq₁ + r₁, where 0 ≤ r₁ < |b| for some q₁, r₁ ∈ Z and a = bq₂ + r₂,
where 0 ≤ r₂ < |b|, for some q₂, r₂ ∈ Z.

Then b(q₁ − q₂) = (r₂ − r₁). Hence |b||q₁ − q₂| = |r₂ − r₁|. If r₂ − r₁ ≠ 0, then
|b| ≤ |r₂ − r₁|. Since 0 ≤ r₁ < |b| and 0 ≤ r₂ < |b|, we have |r₂ − r₁| < |b|. This
contradiction implies that r₁ = r₂. Now from b(q₂ − q₁) = 0, (b ≠ 0), it follows that
q₂ − q₁ = 0. Hence r₁ = r₂ and q₁ = q₂ ⇒ uniqueness of q and r in (i). □
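
The two cases of the proof suggest a direct computation of q and r. The sketch below is one possible realisation, not the text's own procedure: it relies on Python's built-in `divmod`, which for a positive divisor returns a remainder in the range 0 ≤ r < divisor, and then handles b < 0 exactly as in Case 2 by dividing by −b and negating the quotient. The function name `divide` is an assumption.

```python
def divide(a, b):
    """Return (q, r) with a = b*q + r and 0 <= r < |b|, for integers a and b != 0."""
    if b == 0:
        raise ValueError("b must be non-zero")
    # Case 1 of the proof: divide by the positive integer |b|.
    q, r = divmod(a, abs(b))
    # Case 2 of the proof: for b < 0 take q = -q', keeping the same remainder.
    if b < 0:
        q = -q
    return q, r
```

For example, dividing −7 by 3 gives q = −3 and r = 2, since −7 = 3 · (−3) + 2; the remainder stays non-negative even when a or b is negative.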

Definition 1.4.2 A non-zero integer b is said to divide (or b is a factor of) an integer
a iff there exists an integer c such that a = bc. If b divides a, then we write it in
symbols as b|a. In case b does not divide a, we write b∤a.

Some immediate consequences of this definition are listed below: 

Theorem 1.4.4 Let a, b, c be integers.

(1) If 0 ≠ a, then a|0, a|a, 1|b.

(2) If 0 ≠ a, 0 ≠ c, then a|b implies ac|bc.

(3) If 0 ≠ a, 0 ≠ b, then a|b and b|c imply a|c.

(4) If 0 ≠ a, then a|b and a|c imply a|(bn + cm) for every n, m ∈ Z.

Proof Left as an exercise. □ 

We now introduce the notion of greatest common divisor. 

Definition 1.4.3 Let a, b be two integers, not both zero. A non-zero integer d is
said to be a greatest common divisor of a and b iff the following hold:

(1) d|a and d|b (d is a common factor of a and b);

(2) if c ≠ 0 is an integer such that c|a and c|b, then c|d.

A natural question to ask is whether the elements a, b ∈ Z can possess two different
greatest common divisors. Suppose d₁ and d₂ are two greatest common divisors
of a, b. Then from the definition, d₁|d₂ and d₂|d₁. Hence d₁ = ±d₂. This shows that
one of d₁ and d₂ is a positive integer. Then one will be the negative of the other. The
positive one is called the positive greatest common divisor. For two integers a, b,
the positive greatest common divisor, if it exists, is denoted by gcd(a, b) or simply
by (a, b).

Theorem 1.4.5 If a and b are integers, not both zero, then gcd(a, b) exists uniquely
and gcd(a, b) is expressible in the form

gcd(a, b) = pa + qb for some p, q ∈ Z.

Proof Let S = {ma + nb : m, n ∈ Z}. Since a and b are not both zero, S contains
non-zero integers. Suppose 0 ≠ ma + nb ∈ S. Then −(ma + nb) = (−m)a +
(−n)b ∈ S. Since ma + nb and −(ma + nb) both belong to S, we find that S contains
positive integers. Let S⁺ be the set of all positive integers of S. Since S⁺ ≠ ∅,
by the well-ordering principle, it follows that S⁺ contains a least positive integer d.
Hence d ≤ t, for all t ∈ S⁺ and d = pa + qb for some p, q ∈ Z. We show that
d = gcd(a, b). From the division algorithm there exist integers c and r such that

a = c(pa + qb) + r, 0 ≤ r < pa + qb = d.

Suppose r > 0. Now r = (1 − cp)a + (−cq)b shows that r ∈ S⁺. As 0 < r < d,
it contradicts the fact that d is the least element in S⁺. Hence r = 0. Consequently,
(pa + qb)|a. Similarly, we show that (pa + qb)|b. Next, assume that u is a non-zero
integer such that u|a and u|b. Then from Theorem 1.4.4 it follows that u|(pa + qb).
Hence pa + qb is a greatest common divisor of a, b. But pa + qb is positive.
Consequently gcd(a, b) exists and gcd(a, b) = pa + qb. The uniqueness of gcd(a, b)
follows from the antisymmetric property of the divisibility relation on N⁺ (see
Example 1.2.11(iii)). □

Corollary Let a and b be two integers, not both zero. Then gcd(a, b) = 1 if and only
if there exist integers u and v such that au + bv = 1.

Proof First let gcd(a, b) = 1. Then by Theorem 1.4.5, there exist integers u and
v such that au + bv = 1. Conversely, let there exist integers u and v such that
au + bv = 1. Let d = gcd(a, b). Then by definition of greatest common divisor, d
divides both a and b and hence au + bv. Thus d divides 1, resulting in d = 1. □

Remark If gcd(a, b) = pa + qb, for p, q ∈ Z, then p, q may not be unique. For
example, gcd(1492, 1066) = −5 · 1492 + 7 · 1066. So here p = −5 and q = 7. Now
we may take p′ = p + 1066 = −5 + 1066 = 1061 and q′ = q − 1492 = 7 − 1492 =
−1485. Then 1492p′ + 1066q′ = 1492p + 1492 · 1066 + 1066q − 1066 · 1492 =
1492p + 1066q = gcd(1492, 1066).
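
The proof of Theorem 1.4.5 is non-constructive (it uses well-ordering), but a pair p, q can be computed in practice. The sketch below uses the extended Euclidean algorithm, which is a standard technique swapped in here for illustration and is not the method of the proof above; the name `extended_gcd` is an assumption.

```python
def extended_gcd(a, b):
    """Return (d, p, q) with d = gcd(a, b) >= 0 and d = p*a + q*b.

    Assumes a and b are integers, not both zero.
    Invariant maintained throughout: old_r = old_p*a + old_q*b and r = p*a + q*b.
    """
    old_r, r = a, b
    old_p, p = 1, 0
    old_q, q = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    if old_r < 0:  # normalise so that the returned gcd is positive
        old_r, old_p, old_q = -old_r, -old_p, -old_q
    return old_r, old_p, old_q
```

Calling `extended_gcd(1492, 1066)` returns d = 2 together with some pair (p, q) satisfying 1492p + 1066q = 2. As the Remark notes, such a pair is not unique: both (−5, 7) and (1061, −1485) work.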

We now list some of the interesting properties of gcd (a,b). All these can be 
proved easily. 

Proposition 1.4.2 If a, b, c are any three non-zero integers, then

(i) gcd(a, gcd(b, c)) = gcd(gcd(a, b), c);

(ii) gcd(a, 1) = 1;

(iii) gcd(ca, cb) = |c| gcd(a, b);

(iv) gcd(a, b) = 1 and gcd(a, c) = 1 imply that gcd(a, bc) = 1;

(v) a|c, b|c and gcd(a, b) = 1 imply that ab|c.
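
Before proving the properties of Proposition 1.4.2, it can be reassuring to test them on a range of small non-zero integers. The sketch below is only a finite sanity check using Python's `math.gcd`; since `math.gcd` always returns a non-negative value, property (iii) is checked in the form gcd(ca, cb) = |c| gcd(a, b). The function name and the default bound are assumptions.

```python
from itertools import product
from math import gcd

def check_gcd_properties(bound=8):
    """Test properties (i)-(v) for all non-zero a, b, c with |a|, |b|, |c| <= bound."""
    values = [v for v in range(-bound, bound + 1) if v != 0]
    for a, b, c in product(values, repeat=3):
        assert gcd(a, gcd(b, c)) == gcd(gcd(a, b), c)        # (i)
        assert gcd(a, 1) == 1                                # (ii)
        assert gcd(c * a, c * b) == abs(c) * gcd(a, b)       # (iii)
        if gcd(a, b) == 1 and gcd(a, c) == 1:                # (iv)
            assert gcd(a, b * c) == 1
        if c % a == 0 and c % b == 0 and gcd(a, b) == 1:     # (v)
            assert c % (a * b) == 0
    return True
```

A run that returns `True` found no counterexample in the tested range; it does not replace the proofs.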

Definition 1.4.4 An integer p > 1 is called a prime integer iff the only factors of p
are ±1 and ±p in Z; otherwise p is said to be a composite number. Two integers a
and b are called relatively prime (or co-prime) iff gcd(a, b) = 1.

The following theorem gives an alternative definition of a prime integer. 

Theorem 1.4.6 An integer p > 1 is a prime integer iff ∀ a, b ∈ Z, p|ab implies
either p|a or p|b.

Proof Suppose p is a prime integer and p|ab. We want to show that either p|a or
p|b. Suppose p∤a. Since p is prime, we know that 1 and p are the only positive
divisors of p. Hence gcd(p, a) = 1. Then Theorem 1.4.5 asserts that

1 = cp + da for some integers c and d.

Hence b = 1 · b = cpb + dab. Now p|p and p|ab. Consequently p|cpb and p|dab.
Then p|(cpb + dab). This proves that p|b.

Conversely, assume that p > 1 is an integer such that p|ab (a, b ∈ Z) implies that
p|a or p|b. We now prove that p is prime. Suppose 0 ≠ m and m|p. Then p = mn
for some n ∈ Z. Now p|p. Hence p|mn. Then either p|m or p|n.

Suppose p|m. Then there exists d ∈ Z such that m = pd. Then p = pdn. This
implies dn = 1. Since d, n are integers, we find that d = 1 and n = 1 or d = −1 and
n = −1. Now from p = mn it follows that m = ±p.

Again, if we assume that p|n, then proceeding as above we can show that n = ±p
and then m = ±1. Consequently ±1 and ±p are the only factors of p. As a result p
is a prime integer. □

Corollary Let n be a positive integer and a₁, a₂, ..., aₙ be integers such that
p|a₁a₂···aₙ, where p is prime. Then p|aᵢ for some i such that 1 ≤ i ≤ n, i.e., p
divides at least one of a₁, a₂, ..., aₙ.

Proof We prove this corollary by using the principle of mathematical induction on n.
If n = 1, the result is trivially true. Assume that the result is valid for some m such
that 1 ≤ m < n. Let p|a₁a₂···aₘaₘ₊₁. Then from Theorem 1.4.6 either p|aₘ₊₁ or
p|a₁a₂···aₘ. If p|a₁a₂···aₘ, then from our hypothesis p|aᵢ for some i such that
1 ≤ i ≤ m. Hence either p|aₘ₊₁ or p divides at least one of a₁, a₂, ..., aₘ. Now the
result follows by the principle of mathematical induction. □

Remark As a generalization of Theorem 1.4.6 one can prove the following. Let a,
b and m ≠ 0 be integers such that gcd(a, m) = 1 and m|ab. Then m|b.

The prime numbers form the building blocks of integers. First we shall show that
there exist infinitely many primes. To prove this we use the following lemma.

Lemma 1.4.1 If a > 1 is an integer, then there exists a prime integer p such
that p|a.

Proof Let S = {b ∈ Z : b is an integer > 1 and b|a}. Now a ∈ S and then S ≠ ∅. By
the well-ordering principle, there exists an integer p in S such that p ≤ q for all q ∈ S.
This clearly shows p > 1 and p|a. Suppose p is not a prime integer. Then p has a
factor m such that 1 < m < p. Now m|p and p|a. Hence m|a. As a result m ∈ S.
This contradicts the minimality of p in S. Consequently p is a prime integer. □
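
The proof of Lemma 1.4.1 is effectively an algorithm: the least element of S, i.e., the smallest divisor of a that exceeds 1, is prime. A minimal sketch, with illustrative function names and a deliberately naive primality test, is given below.

```python
def least_divisor_gt_one(a):
    """Least element of S = {b : b > 1 and b | a}, for an integer a > 1."""
    if a <= 1:
        raise ValueError("a must be an integer greater than 1")
    # a itself belongs to S, so the search always succeeds.
    return next(b for b in range(2, a + 1) if a % b == 0)

def is_prime(p):
    """Naive test: p > 1 and no factor m with 1 < m < p."""
    return p > 1 and all(p % m != 0 for m in range(2, p))
```

For instance, `least_divisor_gt_one(91)` is 7, and the value returned is prime for every a > 1, as the lemma asserts.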

We are now in a position to prove the following celebrated result. 

Theorem 1.4.7 (Euclid) There are infinitely many prime integers. 

Proof It is true that there are prime integers such as: 2, 3, 5, 7. Suppose there is only
a finite number of distinct primes, say p₁, p₂, ..., pₙ. Consider the positive integer

m = p₁p₂···pₙ + 1.

Now m = p₁(p₂···pₙ) + 1 shows that p₁∤m. Similarly we can show that none
of p₂, p₃, ..., pₙ divides m. Since m > 1, Lemma 1.4.1 asserts that there exists a
prime integer p such that p|m. This prime p is clearly different from p₁, p₂, ..., pₙ.
Consequently, there is no finite listing of prime integers. In other words, there exist
an infinite number of prime integers. □
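
Euclid's argument can be followed numerically: from any finite list of primes it produces a prime outside the list. The sketch below assumes the input list contains only primes; the names `euclid_number` and `new_prime` are hypothetical. Note that the integer m itself need not be prime; only its least divisor greater than 1 is guaranteed to be a prime missing from the list.

```python
def euclid_number(primes):
    """m = p1 * p2 * ... * pn + 1 for the given list of primes."""
    m = 1
    for p in primes:
        m *= p
    return m + 1

def new_prime(primes):
    """Least divisor > 1 of m (prime by Lemma 1.4.1); it divides m, so it is not in the list."""
    m = euclid_number(primes)
    return next(d for d in range(2, m + 1) if m % d == 0)
```

Starting from 2, 3, 5, 7 one gets m = 211, which happens to be prime. Starting from 2, 3, 5, 7, 11, 13 one gets m = 30031 = 59 · 509, so the new prime found is 59.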

We shall prove a little later another celebrated theorem known as the fundamental 
theorem of arithmetic which proves that primes are indeed the building blocks for 
integers. For this purpose we introduce the concept of factorization of integers. 

Definition 1.4.5 An integer n > 1 is said to be factorizable in N⁺ iff there exist
prime integers p₁, p₂, ..., pₖ such that n = p₁p₂···pₖ.

An integer n > 1 is said to be uniquely factorizable in N⁺ iff the following two
conditions hold:

(i) n is factorizable;

(ii) if n = p₁p₂···pᵣ = q₁q₂···qₛ are two factorizations of n (as a product of prime
integers), then r = s and the two factorizations differ only in the order of the
factors.

We now prove the following basic result. 

Theorem 1.4.8 (The Fundamental Theorem of Arithmetic) Every integer n > 1 is 
uniquely factorizable (up to order). 

We prove this result by using the second principle of mathematical induction 
which is stated as follows. 

Lemma 1.4.2 (Second Principle of Mathematical Induction or Principle of Strong
Mathematical Induction) Let S be a non-empty subset of N⁺ such that

(i) some fixed positive integer n₀ ∈ S and

(ii) for each positive integer m > n₀, each of n₀ + 1, n₀ + 2, ..., m − 1 ∈ S implies
m ∈ S.

Then S = {n ∈ N⁺ : n ≥ n₀}.

Proof Let X = N⁺ − {1, 2, ..., n₀ − 1} and T = X − S. If we can show that T = ∅,
then the lemma follows. Suppose T ≠ ∅. Then T is a non-empty subset of N⁺. Thus
by the well-ordering principle T contains a least element, say m. As m ∈ T, m ∉ S
and m ∉ {1, 2, ..., n₀ − 1}. Now by (i), m ≠ n₀. For every x satisfying n₀ < x < m,
x ∉ T = X − S = X ∩ S′, where S′ is the complement of S in N⁺. This implies
x ∈ X′ ∪ S. As X′ = {1, 2, ..., n₀ − 1}, this implies x ∈ S. Thus by (ii), m ∈ S. This
is a contradiction. Consequently T = ∅ and then S = {n ∈ N⁺ : n ≥ n₀}. □

Corollary Let S be a subset of N⁺ such that

(i) 1 ∈ S, and

(ii) if m > 1 and n ∈ S for all n < m, then m ∈ S.

Then S = N⁺.

Proof of the Fundamental Theorem of Arithmetic Let n > 1. We first show that n is
factorizable (as a product of prime integers). We prove this by the second principle
of induction. If n = 2, then there exists a prime integer p₁ = 2 such that n = p₁
(= 2). Suppose that any integer r, 2 ≤ r < n, can be expressed as a product of primes.

Consider now the integer n. From Lemma 1.4.1 there exists a prime integer p₁
such that p₁|n. Then we can write n = p₁n₁ for some integer n₁. Since n > 2, p₁ >
1, we must have n₁ ≥ 1. If n₁ = 1, then n = p₁. Suppose n₁ ≠ 1. Then 2 ≤ n₁ < n
and from the induction hypothesis we can write

n₁ = p₂p₃···pₜ for some primes p₂, p₃, ..., pₜ.

Hence n = p₁n₁ = p₁p₂p₃···pₜ.

Thus from the second principle of mathematical induction, it follows that any
positive integer n ≥ 2 can be expressed as a product of prime integers.
We now show that the factorization of a positive integer n > 1 is unique up to
the rearrangement of the order of the factors. We prove this also by the method of
induction. If n = 2, then the factorization of n is unique. Assume that the uniqueness
of factorization is true for any positive integer m such that 2 ≤ m < n. Consider the
integer n (> 2). Suppose n = p₁p₂···pₛ = q₁q₂···qₜ are two factorizations of n
(as a product of prime integers). Since p₁(p₂···pₛ) = q₁q₂···qₜ, it follows that
p₁|q₁q₂···qₜ and hence p₁ divides some qᵢ (1 ≤ i ≤ t). But 1 and qᵢ are the only
positive divisors of qᵢ. Consequently p₁ = qᵢ. We may assume that the qᵢ's are so
arranged that q₁ = p₁. Thus

p₁p₂···pₛ = p₁q₂···qₜ.

Since p₁ ≠ 0, we may cancel p₁ and get p₂p₃···pₛ = q₂q₃···qₜ = n₁ (say). But
1 ≤ n₁ < n. Then by the induction hypothesis it follows that n₁ is uniquely factorizable.
Hence s − 1 = t − 1 and the factorization q₂q₃···qₜ is just a rearrangement of
the pᵢ's, i = 2, 3, ..., s. Thus s = t and since p₁ = q₁ it follows that n is uniquely
factorizable. Hence by induction we find that any integer n > 1 is uniquely
factorizable. □


From the fundamental theorem of arithmetic, we can write $n > 1$ as a product
of primes and since the prime factors are not necessarily distinct, the result can be
written in the form

$$n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r},$$

where $p_1, p_2, \ldots, p_r$ are distinct primes and $\alpha_1, \alpha_2, \ldots, \alpha_r$ are positive integers. It
turns out that the above representation of $n$ as a product of primes is unique in the
sense that, for a fixed $n$ ($> 1$), any other representation is merely a permutation of
the factors.


Remark We usually take $p_1 < p_2 < \cdots < p_r$.
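
The existence part of the proof above can be mirrored by a small computation. The following is only a minimal sketch, assuming plain trial division (the function name `prime_factorization` is illustrative and not taken from the text); it returns the exponents $\alpha_i$ of the canonical form.

```python
from collections import Counter

def prime_factorization(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n > 1, using trial division."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    exponents = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:      # p is the least divisor > 1 of the current n, hence prime
            exponents[p] += 1
            n //= p
        p += 1
    if n > 1:                  # whatever is left has no divisor <= sqrt(n), so it is prime
        exponents[n] += 1
    return dict(exponents)

print(prime_factorization(360))  # {2: 3, 3: 2, 5: 1}
```

Here $360 = 2^3 \cdot 3^2 \cdot 5$, and the primes are produced in increasing order.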

Alternative Form of the Second Principle of Mathematical Induction Let
$P(n)$ be a statement which makes sense for any positive integer $n \ge n_0$. If the
following two statements are true:

(a) $P(n_0)$ is true and

(b) for all $k > n_0$, each of $P(n_0 + 1), P(n_0 + 2), \ldots, P(k - 1)$ is true implies $P(k)$
is true,

then $P(n)$ is true for all $n \ge n_0$.


1.5 Congruences 

The language of congruences plays an important role not only in number theory but
also in abstract algebra. It was developed at the beginning of the
nineteenth century by Karl Friedrich Gauss (1777-1855). Here we study modular
arithmetic, that is, arithmetic of congruence classes, where we simplify
number-theoretic problems by replacing each integer by its remainder when divided by some
fixed positive integer $n$.

Definition 1.5.1 If $a$ and $b$ are integers, we say that $a$ is congruent to $b$ modulo a
positive integer $n$, denoted by $a \equiv b \pmod{n}$, iff $n \mid (a - b)$. If $n \nmid (a - b)$ we write
$a \not\equiv b \pmod{n}$, and say that $a$ and $b$ are incongruent modulo $n$.

Example 1.5.1 We have $3 \equiv 23 \pmod{10}$, as 10 divides $3 - 23$. Similarly,
$2 \equiv 5 \pmod{3}$.

The following proposition provides some important properties of congruences.

Proposition 1.5.1 Let $n$ be a positive integer. Then the congruence modulo the
positive integer $n$ satisfies the following properties:

(i) reflexive property: If $a$ is an integer, then $a \equiv a \pmod{n}$;

(ii) symmetric property: If $a$ and $b$ are integers such that $a \equiv b \pmod{n}$, then
$b \equiv a \pmod{n}$;

(iii) transitive property: If $a$, $b$, and $c$ are integers with $a \equiv b \pmod{n}$ and
$b \equiv c \pmod{n}$, then $a \equiv c \pmod{n}$.

From Proposition 1.5.1, we see that the set of integers is divided into $n$ different
sets called congruence classes modulo $n$, each containing integers which are
mutually congruent modulo $n$. The reader may also revisit Examples 1.2.3 and 1.2.9.
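
To make the partition into congruence classes concrete, here is a small sketch that groups a finite window of integers by remainder. The helper name `congruence_classes` and the chosen window are assumptions made for illustration only.

```python
def congruence_classes(n: int, lo: int, hi: int) -> dict[int, list[int]]:
    """Group the integers lo..hi (inclusive) by their remainder modulo n."""
    classes = {r: [] for r in range(n)}
    for a in range(lo, hi + 1):
        classes[a % n].append(a)   # for n > 0, Python's % always lies in 0..n-1
    return classes

classes = congruence_classes(3, -4, 7)
for r, members in classes.items():
    print(r, members)
# 0 [-3, 0, 3, 6]
# 1 [-2, 1, 4, 7]
# 2 [-4, -1, 2, 5]
```

Any two members of the same list differ by a multiple of 3, and every integer in the window lands in exactly one list.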

Arithmetic with congruences is very common in number theory. Congruences
($\equiv$) have many of the properties that “equalities” ($=$) have. The following
proposition states that adding, subtracting, or multiplying the same integer on both sides
of a congruence preserves the congruence.

Proposition 1.5.2 If $a$, $b$, $c$ are integers and $n$ is a positive integer such that
$a \equiv b \pmod{n}$, the following properties are satisfied:

(i) $a + c \equiv b + c \pmod{n}$;

(ii) $a - c \equiv b - c \pmod{n}$;

(iii) $ac \equiv bc \pmod{n}$.

A very natural question that comes to mind is: what happens when both sides
of a congruence are divided by a non-zero integer? Let us consider the following
example.

Example 1.5.2 Let us consider 14 modulo 6. We have $7 \cdot 2 \equiv 4 \cdot 2 \pmod{6}$. But
$7 \not\equiv 4 \pmod{6}$.

This example shows that dividing both sides of a congruence by an integer does not
necessarily preserve the congruence. However, the following theorem gives a
valid congruence when both sides of a congruence are divided by the same integer.

Theorem 1.5.1 If $a$, $b$, $c$, and $n$ are integers such that $n > 0$, $d = \gcd(c, n)$, and
$ac \equiv bc \pmod{n}$, then $a \equiv b \pmod{n/d}$.

Proof As $ac \equiv bc \pmod{n}$, there exists an integer $t$ such that $c(a - b) = nt$. This
implies that $c(a - b)/d = nt/d$. Again, $\gcd(c/d, n/d) = 1$ implies that $n/d$ divides
$(a - b)$. Hence $a \equiv b \pmod{n/d}$. □

Example 1.5.3 Let us take the example of 65 modulo 15. We have
$65 \equiv 35 \pmod{15}$. As $\gcd(5, 15) = 5$, by the above theorem we have
$65/5 \equiv 35/5 \pmod{15/5}$, which implies $13 \equiv 7 \pmod{3}$.
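
Theorem 1.5.1 can be checked numerically on the two examples above. The sketch below is illustrative only; the function name `cancel_common_factor` is an assumption, and the code simply verifies the hypothesis and the conclusion rather than proving anything.

```python
from math import gcd

def cancel_common_factor(a: int, b: int, c: int, n: int) -> int:
    """Given ac ≡ bc (mod n), return the modulus n // gcd(c, n) for which a ≡ b holds."""
    if n <= 0:
        raise ValueError("n must be positive")
    if (a * c - b * c) % n != 0:
        raise ValueError("hypothesis ac ≡ bc (mod n) fails")
    reduced = n // gcd(c, n)
    assert (a - b) % reduced == 0   # conclusion of Theorem 1.5.1
    return reduced

print(cancel_common_factor(13, 7, 5, 15))  # 3  (Example 1.5.3: 13 ≡ 7 mod 3)
print(cancel_common_factor(7, 4, 2, 6))    # 3  (Example 1.5.2: 7 ≡ 4 mod 3, though not mod 6)
```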

The following corollary is immediate.

Corollary If $a$, $b$, $c$ and $n$ are integers such that $n > 0$, $\gcd(c, n) = 1$ and
$ac \equiv bc \pmod{n}$, then $a \equiv b \pmod{n}$.

The following proposition, which is more general than Proposition 1.5.2, is also
useful.

Proposition 1.5.3 If $a$, $b$, $c$, $d$ are integers and $n$ is a positive integer such that
$a \equiv b \pmod{n}$ and $c \equiv d \pmod{n}$, the following properties are satisfied:

(i) $a + c \equiv b + d \pmod{n}$;

(ii) $a - c \equiv b - d \pmod{n}$;

(iii) $ac \equiv bd \pmod{n}$.

The following theorem shows that a congruence is preserved when both sides are
raised to the same positive integral power.

Theorem 1.5.2 If $a$, $b$ are integers and $n$ and $k$ are positive integers such that
$a \equiv b \pmod{n}$, then $a^k \equiv b^k \pmod{n}$.

Proof As $a \equiv b \pmod{n}$, $n$ divides $a - b$. Now,

$$a^k - b^k = (a - b)(a^{k-1} + a^{k-2} b + \cdots + a b^{k-2} + b^{k-1})$$

implies that $a - b$ divides $a^k - b^k$ and hence $n$ divides $a^k - b^k$.
Thus $a^k \equiv b^k \pmod{n}$. □
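
As a quick sanity check of Theorem 1.5.2, the sketch below tests the first few powers directly and then uses Python's built-in three-argument `pow` for a larger exponent. The helper name `powers_stay_congruent` and the bound `max_k` are assumptions chosen for illustration.

```python
def powers_stay_congruent(a: int, b: int, n: int, max_k: int = 20) -> bool:
    """Check a^k ≡ b^k (mod n) for k = 1..max_k, assuming a ≡ b (mod n)."""
    if (a - b) % n != 0:
        raise ValueError("a and b must be congruent modulo n")
    return all((a**k - b**k) % n == 0 for k in range(1, max_k + 1))

print(powers_stay_congruent(3, 23, 10))       # True
print(pow(3, 100, 10) == pow(23, 100, 10))    # True
```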

Supplementary Examples (SE-II) 

1 (i) The integer $3^{2n} - 1$ is divisible by 8 $\forall n \ge 1$;

(ii) $3^{2n} \equiv 1 \pmod{8}$ $\forall n \ge 1$;

(iii) $3^{2n} - 1 \equiv 0 \pmod{8}$ $\forall n \ge 1$.

[Hint. (i) Let $P(n)$ be the statement that the integer $3^{2n} - 1$ is divisible by 8.
Then $P(1)$ is true, since $9 - 1 = 8$ is divisible by 8. Next suppose that
$P(k)$ is true for some $k \in \mathbb{N}^+$, i.e., $3^{2k} - 1$ is divisible by 8. Now $3^{2(k+1)} - 1 =
3^{2k} 3^2 - 3^2 + 3^2 - 1 = 3^2 (3^{2k} - 1) + (3^2 - 1)$ is divisible by 8. Hence by the principle
of mathematical induction, $P(n)$ is true $\forall n \in \mathbb{N}^+$.

(ii) follows from (i).

(iii) follows from (ii).]

2 (i) $2^n \ge 1 + n$ $\forall n \in \mathbb{N}^+$;

(ii) $2^n > 2n + 1$ for all integers $n \ge 3$.

[Hint. (i) Let $P(n)$ be the statement: $2^n \ge 1 + n$, for $n \in \mathbb{N}^+$. Then $P(1)$ is true.
Next suppose that $P(k)$ is true for some integer $k \ge 1$, i.e., suppose that $2^k \ge 1 + k$
for some integer $k \ge 1$. Then it follows that $2^{k+1} = 2^k \cdot 2 \ge (1 + k) 2 = 2 + 2k > 1 +
(k + 1) \Rightarrow P(k + 1)$ is also true. Hence by the principle of mathematical induction,
$P(n)$ is true $\forall n \in \mathbb{N}^+$.

(ii) Let $P(n)$ be the statement: $2^n > 2n + 1$, for integers $n \ge 3$. Then $2^3 = 8 >
2 \cdot 3 + 1 \Rightarrow P(3)$ is true. Next let $P(n)$ be true for some $n \ge 3$, i.e., $2^n > 2n + 1$ for
some integer $n \ge 3$. Now $2^{n+1} = 2^n \cdot 2 > (2n + 1) \cdot 2 > 2(n + 1) + 1 \Rightarrow P(n + 1)$ is
true. Hence by the principle of mathematical induction, $P(n)$ is true for all integers
$n \ge 3$.]

3 Let $m, n \in \mathbb{N}^+$. Then

(i) $n \neq m + n$;

(ii) $n = mn \Leftrightarrow m = 1$.

[Hint. (i) Let $P(n)$ be the statement that $n \neq m + n$ for arbitrary $m \in \mathbb{N}^+$ and
every $n \in \mathbb{N}^+$. Then by Peano’s axiom (3), $1 \neq m + 1 \Rightarrow P(1)$ is true. Next let
$P(n)$ be true for some $n \in \mathbb{N}^+$ and arbitrary $m \in \mathbb{N}^+$. Then $n \neq m + n \Rightarrow s(n) \neq
s(m + n) \Rightarrow n + 1 \neq m + (n + 1) \Rightarrow P(n + 1)$ is true. Hence by the principle of
mathematical induction $P(n)$ is true for arbitrary $m \in \mathbb{N}^+$ and every $n \in \mathbb{N}^+$, i.e.,
$P(n)$ is true $\forall m, n \in \mathbb{N}^+$.

(ii) $n = 1 \cdot n$ proves the sufficiency of the condition. To prove the necessity of
the condition, let $m \neq 1$. Then $\exists$ some $c \in \mathbb{N}^+$ such that $s(c) = m$. Hence $mn =
s(c) n = (c + 1) n = cn + n$. Hence for $m \neq 1$, $n = mn$ would imply $n = cn + n$,
which contradicts (i).]

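4 Show that $\sqrt{n} \in \mathbb{Q} \Rightarrow \sqrt{n} \in \mathbb{N}^+$ ($n \in \mathbb{N}^+$).

[Hint. Suppose $\sqrt{n} = a/b$ where $a, b \in \mathbb{N}^+$. Then $a^2 = b^2 n$. Now taking prime
decomposition, it follows that $n$ is a perfect square $\Rightarrow \sqrt{n} \in \mathbb{N}^+$.]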

5 Let $x \in \mathbb{R}$, i.e., $x$ be a real number. We assume that $\exists$ integers which are $\le x$. So
the set $S$ of all such integers is non-empty and is bounded above ($x$ being an upper
bound). Hence by Zorn’s Lemma there is a greatest integer in $S$ which is less than
or equal to $x$. This integer is called the integral part of $x$ and is denoted by $[x]$. Show
that $[x]$ satisfies the following properties:

(i) $[x] \le x < [x] + 1$;

(ii) $x - [x] \in [0, 1)$;

(iii) if $x = n + r$, where $n \in \mathbb{Z}$ and $r \in [0, 1)$, then $n = [x]$;

(iv) let $m \in \mathbb{Z}$ and $n \in \mathbb{N}^+$ and $m = nq + r$ where $q, r \in \mathbb{Z}$ and $0 \le r \le n - 1$. Then
$q = [m/n]$ and $r = m - n[m/n]$.

[Hint. We have $m/n = q + r/n$ and $0 \le r/n \le (n - 1)/n < 1$.
Hence by (iii), $q = [m/n]$ and so $r = m - n[m/n]$.]

6 If $a \equiv b \pmod{n}$, then $\gcd(a, n) = \gcd(b, n)$ ($a, b, n \in \mathbb{N}^+$).

[Hint. Let $d_1 = \gcd(a, n)$ and $d_2 = \gcd(b, n)$. Now $a \equiv b \pmod{n} \Rightarrow b = a + kn$
for some $k \in \mathbb{Z}$. Hence $d_1 \mid n$ and $d_1 \mid a \Rightarrow d_1 \mid b$ (by Theorem 1.4.4) $\Rightarrow d_1$ is a common
factor of $b$, $n \Rightarrow d_1 \le d_2$. Similarly, $d_2 \le d_1$. Hence $d_1 = d_2$.]

7 Let $m$ and $n$ be non-zero integers. The least common multiple of $m$ and $n$, denoted
by $\operatorname{lcm}(m, n)$, is defined to be the positive integer $l$ such that in $\mathbb{Z}$,

(i) $m \mid l$ and $n \mid l$ and

(ii) $m \mid c$ and $n \mid c \Rightarrow l \mid c$.

Then $\operatorname{lcm}(m, n)$ exists and is unique and $\operatorname{lcm}(m, n)$ has the following relations
with $\gcd(m, n)$:

(a) $\gcd(m, n) \cdot \operatorname{lcm}(m, n) = |mn|$;

(b) $\operatorname{lcm}(m, n) = |mn| \Leftrightarrow \gcd(m, n) = 1$.
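
Examples 5 to 7 above lend themselves to a quick numerical check. The following sketch is illustrative only: `integral_part` and `lcm` are hypothetical helper names, the integral part is computed with `math.floor`, and the least common multiple is computed from relation (a) rather than from the defining properties.

```python
import math
from fractions import Fraction

def integral_part(x) -> int:
    """Greatest integer less than or equal to x, written [x] in the text."""
    return math.floor(x)

def lcm(m: int, n: int) -> int:
    """Least common multiple of two non-zero integers, via gcd(m, n) * lcm(m, n) = |mn|."""
    if m == 0 or n == 0:
        raise ValueError("m and n must be non-zero")
    return abs(m * n) // math.gcd(m, n)

# Example 5(iv): if m = nq + r with 0 <= r <= n - 1, then q = [m/n] and r = m - n[m/n].
for m, n in [(17, 5), (-17, 5), (0, 3), (-1, 4)]:
    q, r = divmod(m, n)                      # for n > 0 Python guarantees 0 <= r < n
    assert q == integral_part(Fraction(m, n))
    assert r == m - n * integral_part(Fraction(m, n))

# Example 6: a ≡ b (mod n) implies gcd(a, n) = gcd(b, n).
assert math.gcd(23, 10) == math.gcd(3, 10)

# Example 7: relation (a), and relation (b) for a coprime pair.
print(lcm(12, -18))   # 36
print(lcm(4, 9))      # 36
```

Exact rationals from `fractions.Fraction` are used so that the floor is not affected by floating-point rounding.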

Supplementary Examples (SE-III) 

1 If $n$ is odd, then $n^2$ is also odd.

[Hint. Suppose $n$ is odd and $n = 2m + 1$ for some $m \in \mathbb{Z}$. Then $n^2 = 4(m^2 +
m) + 1$ is odd as $4(m^2 + m)$ is even.]

2 For every integer $n$, neither $n^2 \equiv 2 \pmod{5}$ nor $n^2 \equiv 3 \pmod{5}$.

[Hint. Let $n$ be an arbitrary integer. Then on division of $n$ by 5, the only possible
remainders $r$ are 0, 1, 2, 3, 4, i.e., $n \equiv r \pmod{5}$ for $r = 0, 1, 2, 3$ or 4, i.e., $n^2 \equiv
r^2 \pmod{5}$ for $r = 0, 1, 2, 3$ or 4. Since $r^2 \equiv 0, 1, 4, 4, 1 \pmod{5}$ respectively,
$n^2 \not\equiv 2 \pmod{5}$ and $n^2 \not\equiv 3 \pmod{5}$.]

3 There exists no integral solution of the equation $7x^2 - 15y^2 = 1$.

[Hint. Suppose there exist integers $n$, $m$ such that $7n^2 - 15m^2 = 1$. Then $7n^2 \equiv
1 \pmod{5}$, since $15m^2 \equiv 0 \pmod{5}$.

Again

$$7 \equiv 2 \pmod{5} \Rightarrow 7n^2 \equiv 2n^2 \pmod{5} \Rightarrow 2n^2 \equiv 1 \pmod{5}. \qquad \text{(i)}$$

By using the above Supplementary Example 2, $\forall n \in \mathbb{Z}$, $n^2 \equiv 0 \pmod{5}$ or $n^2 \equiv
4 \pmod{5}$ or $n^2 \equiv 1 \pmod{5} \Rightarrow 2n^2 \equiv 0 \pmod{5}$ or $2n^2 \equiv 8 \pmod{5} \equiv 3 \pmod{5}$
or $2n^2 \equiv 2 \pmod{5} \Rightarrow$ a contradiction of (i).]

4 Every positive integer $n$ is congruent to the sum of its digits modulo 9. Hence
deduce a test for divisibility by 9.

[Hint. Suppose $n = n_r n_{r-1} \cdots n_2 n_1 n_0$ (written in the customary notation).
Then $n = n_r 10^r + n_{r-1} 10^{r-1} + \cdots + n_2 10^2 + n_1 10 + n_0 \Rightarrow n \equiv (n_r + n_{r-1} +
\cdots + n_2 + n_1 + n_0) \pmod{9}$ (since $10^m \equiv 1 \pmod{9}$ for all integers $m \ge 0$). Clearly,
$n$ is divisible by 9 iff the sum of its digits is divisible by 9.]
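
The residue computations used in Examples 2 to 4 above are small enough to enumerate. The sketch below is only a numerical illustration under that assumption; `squares_mod` and `digit_sum` are hypothetical helper names.

```python
def squares_mod(n: int) -> list[int]:
    """All residues r^2 mod n, for r = 0, 1, ..., n - 1."""
    return sorted({(r * r) % n for r in range(n)})

def digit_sum(n: int) -> int:
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))

# Example 2: a square is never congruent to 2 or 3 modulo 5.
print(squares_mod(5))                                  # [0, 1, 4]

# Example 3: 7x^2 - 15y^2 = 1 would force 2x^2 ≡ 1 (mod 5), but 1 is not attained.
print(sorted({(2 * s) % 5 for s in squares_mod(5)}))   # [0, 2, 3]

# Example 4: every positive integer is congruent to its digit sum modulo 9.
for n in (9, 738, 123456789, 1000001):
    assert n % 9 == digit_sum(n) % 9
print(digit_sum(738), 738 % 9)                         # 18 0
```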


1.5.1 Exercises 

Exercises-II 

1. Prove for any positive integer $n$:

(i) $3^{3n+1} \equiv 3 \cdot 5^n \pmod{11}$;

(ii) $2^{4n+3} \equiv 8 \cdot 5^n \pmod{11}$;

(iii) $3^{3n+1} + 2^{4n+3} \equiv 0 \pmod{11}$.

[Hint. (i) $3^3 \equiv 5 \pmod{11} \Rightarrow 3^{3n} \equiv 5^n \pmod{11} \Rightarrow 3^{3n+1} \equiv 3 \cdot 5^n \pmod{11}$.]

2. Show that there is no solution in integers of the equation

$$x^2 + y^2 = 3z^2$$

except $x = y = z = 0$.

3. Let $n$ be an integer such that $n$ is a perfect square and $n = rt$ where $r$ and $t$ are
relatively prime positive integers. Show that $r$ and $t$ are perfect squares.

[Hint. Consider the prime decomposition of $n$. Note that an integer $m \ge 2$ is a
perfect square $\Leftrightarrow$ every prime appearing in the decomposition of $m$ is of even
power.]

4. Prove that any prime of the form $3n + 1$ is of the form $6r + 1$.

5. Prove that any positive integer of the form $3n + 2$ has a prime factor of the same
form.

6. Prove that $\gcd(n, n + 2) = 1$ or 2 for every integer $n$.

7. Prove that there are infinitely many primes of the form $4n + 3$ or $6n + 5$.

8. Show that $n > 1$ is a prime integer iff for any integer $r$ either $\gcd(r, n) = 1$ or $n \mid r$.

9. Prove that there are arbitrarily large gaps in the series of primes.

[Hint. See Chap. 10.]


1.6 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003; Burton 1989; Halmos 
1974; Hungerford 1974; Jones and Jones 1998; Rosen 1993; Simmons 1963) for 
further details.

Chapter 2

Groups: Introductory Concepts 


Groups serve as one of the fundamental building blocks for the subject known today
as modern algebra. This chapter gives an introduction to group theory and closely
related topics. The idea of group theory was used as early as 1770 by J.L. Lagrange
(1736-1813). Around 1830, E. Galois (1811-1832) extended Lagrange’s work in
the investigation of solutions of equations and introduced the term ‘group’. At that
time, mathematicians worked with groups of transformations. These were sets of
mappings that, under composition, possessed certain properties. Originally, a group
was a set of permutations (i.e., bijections) with the property that the combination of
any two permutations again belongs to the set. Felix Klein (1849-1925) adopted the
idea of groups to unify different areas of geometry. In 1870, L. Kronecker (1823— 
1891) gave a set of postulates for a group. Earlier definitions of groups were gen- 
eralized to the present concept of an abstract group in the first decade of the twen- 
tieth century, which was defined by a set of axioms. The theory of abstract groups 
plays an important role in the present day mathematics and science. Groups arise 
in a number of apparently unrelated disciplines. They appear in algebra, geome- 
try, analysis, topology, physics, chemistry, biology, economics, computer science 
etc. So the study of groups is essential and very interesting. In this chapter, we 
make an introductory study of groups with geometrical applications along with free 
abelian groups and structure theorem for finitely generated abelian groups. More- 
over, semigroups, homology groups, cohomology groups, topological groups, Lie 
groups, Hopf groups, and fundamental groups are discussed. 


2.1 Binary Operations 

Operations on the pairs of elements of a non-empty set arise in several contexts,
such as the usual addition of two integers, usual addition of two residue classes in
$\mathbb{Z}_n$, usual multiplication of two real or complex square matrices of the same order
and similar others. In such cases, we speak of a binary operation. The concept of
binary operations is essential in the study of modern algebra. It provides algebraic
structures on non-empty sets. The concept of binary operations is now introduced.

Definition 2.1.1 Let $S$ be a non-empty set. A binary operation on $S$ is a function
$f : S \times S \to S$.

There are several commonly used notations for the image $f(a, b) \in S$ for $a, b \in
S$, such as $ab$ or $a \cdot b$ (multiplicative notation), $a + b$ (additive notation), $a * b$ etc. We
may therefore think of the usual addition in $\mathbb{Z}$ as being a function $\mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$, where
the image of every pair of integers $(m, n) \in \mathbb{Z} \times \mathbb{Z}$ is denoted by $m + n$. Clearly, a
binary operation ‘$\cdot$’ on $S$ is equivalent to the statement: if $a, b \in S$, then $a \cdot b \in S$
(known as the closure axiom). In this event $S$ is said to be closed under ‘$\cdot$’. There
may be several binary operations on a non-empty set $S$. For example, usual addition
$(a, b) \mapsto a + b$; usual multiplication $(a, b) \mapsto ab$; $(a, b) \mapsto$ maximum $\{a, b\}$,
minimum $\{a, b\}$, $\operatorname{lcm}\{a, b\}$, $\gcd\{a, b\}$ for $(a, b) \in \mathbb{N}^+ \times \mathbb{N}^+$ are binary operations on
$\mathbb{N}^+$.

A binary operation on a set S is sometimes called a composition in S. 

Example 2.1.1 Two binary operations, called addition and multiplication, are
defined on $\mathbb{Z}_n$ (see Chap. 1) by $((a), (b)) \mapsto (a + b)$ and $((a), (b)) \mapsto (ab)$,
respectively. Clearly, each of these operations is independent of the choice of
representatives of the classes and hence is well defined.

Example 2.1.2 Let $S$ be any non-empty set. Then

(i) both union and intersection of two subsets of $S$ define binary operations on
$\mathcal{P}(S)$, the power set of $S$;

(ii) the usual composition ‘$\circ$’ of two mappings of $M(S)$, the set of all mappings
of $S$ into $S$, is a binary operation on $M(S)$, i.e., $(f, g) \mapsto f \circ g$, $f, g \in M(S)$,
defines a binary operation on $M(S)$.

Clearly, in general $f \circ g \neq g \circ f$ for $f, g \in M(S)$ and an arbitrary set $S$.

For example, if $S = \{1, 2, 3\}$ and $f, g \in M(S)$ are defined by
$f(s) = 1$ $\forall s \in S$ and $g(s) = 2$ $\forall s \in S$, then $(g \circ f)(s) = g(f(s))
= g(1) = 2$ and $(f \circ g)(s) = f(g(s)) = f(2) = 1$, $\forall s \in S$
$\Rightarrow g \circ f \neq f \circ g$.

We may construct a table called the Cayley table as a convenient way of defining
a binary operation only on a finite set $S$ or tabulating the effect of a binary operation
on $S$. Let $S$ be a set with $n$ distinct elements. To construct a table, the elements of
$S$ are arranged horizontally in a row, called the initial row or 0-row; these are again
arranged vertically in a column, called the initial column or 0-column. The element
in the $i$th position in the 0-column determines a horizontal row across it, called the
$i$th row of the table.

Similarly, the element in the $j$th position in the 0-row determines a vertical
column below it, called the $j$th column of the table. The $(i, j)$th position in the table is
determined by the intersection of the $i$th row and the $j$th column.

The $(i, j)$th positions ($i, j = 1, 2, \ldots, n$) are filled up by elements of $S$ in any
manner. Accordingly, a binary operation on $S$ is determined.

For two elements $u, v \in S$, where $u$ occupies the $i$th position in the 0-column and $v$
occupies the $j$th position in the 0-row, $(u, v) \mapsto u \circ v$, where $u \circ v$ is the element in the
$(i, j)$th position, $i, j = 1, 2, 3, \ldots, n$.

There is also a reverse procedure: given a binary operation $f$ on a finite set $S$, we
can construct a table displaying the effect of $f$. For example, let $S = \{1, 2, 3\}$ and $f$
a binary operation on $S$ defined by

$f(1, 1) = 1$, $f(1, 2) = 1$, $f(1, 3) = 2$, $f(2, 1) = 2$, $f(2, 2) = 3$,

$f(2, 3) = 3$, $f(3, 1) = 1$, $f(3, 2) = 3$, $f(3, 3) = 2$.

Then the corresponding table is

    f | 1  2  3
    --+---------
    1 | 1  1  2
    2 | 2  3  3
    3 | 1  3  2

A table of this kind is termed a multiplication table or composition table or Cayley’s
table on $S$.
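
The passage from a binary operation on a finite set to its Cayley table is mechanical, so it can be sketched in a few lines. The code below is illustrative only: it stores the operation $f$ from the example as a dictionary, and the helper names (`cayley_table`, `is_closed`, `is_commutative`, `is_associative`) are assumptions, not notation from the text.

```python
from itertools import product

S = [1, 2, 3]
# The binary operation f of the example, stored as {(u, v): f(u, v)}.
f = {(1, 1): 1, (1, 2): 1, (1, 3): 2,
     (2, 1): 2, (2, 2): 3, (2, 3): 3,
     (3, 1): 1, (3, 2): 3, (3, 3): 2}

def cayley_table(elements, op):
    """Row i, column j holds op(u, v) for the i-th element u and the j-th element v."""
    return [[op[(u, v)] for v in elements] for u in elements]

def is_closed(elements, op):
    return all(op[(u, v)] in elements for u, v in product(elements, repeat=2))

def is_commutative(elements, op):
    return all(op[(u, v)] == op[(v, u)] for u, v in product(elements, repeat=2))

def is_associative(elements, op):
    return all(op[(op[(u, v)], w)] == op[(u, op[(v, w)])]
               for u, v, w in product(elements, repeat=3))

for u, row in zip(S, cayley_table(S, f)):
    print(u, row)
# 1 [1, 1, 2]
# 2 [2, 3, 3]
# 3 [1, 3, 2]
print(is_closed(S, f), is_commutative(S, f), is_associative(S, f))  # True False False
```

The table is not symmetric about its diagonal, which is exactly the failure of commutativity; associativity fails as well, for instance $f(f(1, 3), 1) = 2$ while $f(1, f(3, 1)) = 1$.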

Definition 2.1.2 A groupoid is an ordered pair $(S, \circ)$, where $S$ is a non-empty set
and ‘$\circ$’ is a binary operation on $S$, and a non-empty subset $H$ of $S$ is said to be a
subgroupoid of $(S, \circ)$ iff $(H, \circ_H)$ is a groupoid, where $\circ_H$ is the restriction of ‘$\circ$’ to
$H \times H$.

Supplementary Examples (SE-IA) 

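1 Let $\mathbb{R}$ be the set of all real numbers and $\mathbb{R}^* = \mathbb{R} - \{0\}$.

(i) Define a binary operation $\circ$ on $\mathbb{R}^*$ by $x \circ y = |x| y$. Then $(x \circ y) \circ z = x \circ (y \circ z)$
for all $x, y, z \in \mathbb{R}^*$ but $x \circ y \neq y \circ x$ for some $x, y \in \mathbb{R}^*$.

(ii) Define a binary operation $\circ$ on $\mathbb{R}^*$ by $x \circ y = |x - y|$. Then $(x \circ y) \circ z \neq
x \circ (y \circ z)$ for some $x, y, z \in \mathbb{R}^*$ but $x \circ y = y \circ x$ for all $x, y \in \mathbb{R}^*$.

(iii) Define a binary operation $\circ$ on $\mathbb{R}^*$ by $x \circ y = \min\{x, y\}$. Then $(x \circ y) \circ z =
x \circ (y \circ z)$ and $x \circ y = y \circ x$ for all $x, y, z \in \mathbb{R}^*$.

2 Let $\mathbb{Q}^*$ be the set of all non-zero rational numbers. Define a binary operation $\circ$ on
$\mathbb{Q}^*$ by $x \circ y = x/y$ (usual division).

Then $x \circ y \neq y \circ x$ and $(x \circ y) \circ z \neq x \circ (y \circ z)$ for some $x, y, z \in \mathbb{Q}^*$.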

3 A homomorphism $f$ from a groupoid $(G, \circ)$ into a groupoid $(H, *)$ is a mapping
$f : G \to H$ such that $f(x \circ y) = f(x) * f(y)$ for all $x, y \in G$. A homomorphism $f$
is said to be a monomorphism, epimorphism or isomorphism ($\cong$) according as $f$ is
injective, surjective or bijective.

(i) Isomorphism relation of groupoids is an equivalence relation.

(ii) If $G$ and $H$ are finite groupoids such that $|G| \neq |H|$, then $G$ cannot be
isomorphic to $H$.

(iii) The groupoids $(\mathbb{Z}, \circ)$ and $(\mathbb{Z}, *)$ under the compositions defined by $x \circ y =
x + y + xy$ and $x * y = x + y - xy$ are isomorphic.

[Hint. The function $f : (\mathbb{Z}, \circ) \to (\mathbb{Z}, *)$ defined by $f(x) = -x$ $\forall x \in \mathbb{Z}$ is an
isomorphism.]

(iv) Let $(\mathbb{C}, +)$ be the groupoid of complex numbers under usual addition and $f :
(\mathbb{C}, +) \to (\mathbb{C}, +)$ be defined by $f(a + ib) = a - ib$. Then $f$ is an isomorphism.

(v) Let $(\mathbb{N}^+, \cdot)$ and $(\mathbb{R}, +)$ be groupoids under usual multiplication of positive
integers and usual addition of real numbers. Then $f : (\mathbb{N}^+, \cdot) \to (\mathbb{R}, +)$ defined
by $f(n) = \log_{10} n$ is a monomorphism but not an epimorphism.

[Hint. $0 = \log_{10} 1 < \log_{10} 2 < \log_{10} 3 < \cdots \Rightarrow$ there is no integer $n \in \mathbb{N}^+$
such that $\log_{10} n = -1 \in \mathbb{R} \Rightarrow f$ is not an epimorphism.]
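
Several claims in Supplementary Examples (SE-IA) can be spot-checked on a finite sample. The sketch below is illustrative and proves nothing beyond the sample: the sample of rationals, the helper names, and the range used for the homomorphism check in 3(iii) are all assumptions. A result of `False` exhibits a genuine counterexample, while `True` only means that none was found in the sample.

```python
from fractions import Fraction
from itertools import product

sample = [Fraction(v) for v in (-3, -1, 1, 2, 4)] + [Fraction(-1, 2), Fraction(1, 2)]

def is_commutative(op, elems):
    return all(op(x, y) == op(y, x) for x, y in product(elems, repeat=2))

def is_associative(op, elems):
    return all(op(op(x, y), z) == op(x, op(y, z)) for x, y, z in product(elems, repeat=3))

operations = {
    "|x|y": lambda x, y: abs(x) * y,        # Example 1(i)
    "|x - y|": lambda x, y: abs(x - y),     # Example 1(ii)
    "min{x, y}": min,                       # Example 1(iii)
    "x / y": lambda x, y: x / y,            # Example 2
}
for name, op in operations.items():
    print(name, is_associative(op, sample), is_commutative(op, sample))
# |x|y True False
# |x - y| False True
# min{x, y} True True
# x / y False False

# Example 3(iii): f(x) = -x carries x o y = x + y + xy to x * y = x + y - xy.
circ = lambda x, y: x + y + x * y
star = lambda x, y: x + y - x * y
f = lambda x: -x
window = range(-20, 21)
print(all(f(circ(x, y)) == star(f(x), f(y)) for x in window for y in window))  # True
```

Since $f$ is its own inverse, it is a bijection of $\mathbb{Z}$, which together with the homomorphism identity gives the isomorphism claimed in the hint.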


2.2 Semigroups 

Our main interest in this chapter is in groups. Semigroups and monoids are con- 
venient algebraic systems for stating some theorems on groups. Semigroups play 
an important role in algebra, analysis, topology, theoretical computer science, and 
in many other branches of science. Semigroup actions unify computer science with 
mathematics. 

Definition 2.2.1 A semigroup is an ordered pair $(S, \cdot)$, where $S$ is a non-empty set
and the dot ‘$\cdot$’ is a binary operation on $S$, i.e., a mapping $(a, b) \mapsto a \cdot b$ from $S \times S$
to $S$ such that for all $a, b, c \in S$, $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ (associative law).

For convenience, $(S, \cdot)$ is abbreviated to $S$ and $a \cdot b$ to $ab$. The associative
property in a semigroup ensures that the two products $a(bc)$ and $(ab)c$ are the same, which
can be denoted as $abc$.

Definition 2.2.2 A semigroup $S$ is said to be commutative iff $ab = ba$ for all
$a, b \in S$; otherwise $S$ is said to be non-commutative.

Example 2.2.1 (i) $(\mathbb{Z}, +)$ and $(\mathbb{Z}, \cdot)$ are commutative semigroups, where the binary
operations on $\mathbb{Z}$ are usual addition and multiplication of integers.

(ii) $(\mathcal{P}(S), \cap)$ and $(\mathcal{P}(S), \cup)$ (cf. Example 2.1.2(i)) are commutative semigroups.

(iii) $(M(S), \circ)$ (cf. Example 2.1.2(ii)) is a non-commutative semigroup.

Definition 2.2.3 Let $S$ be a semigroup. An element $1 \in S$ is called a left (right)
identity in $S$ iff $1a = a$ ($a1 = a$) for all $a \in S$ and 1 is called an identity iff it is
both a left and a right identity in $S$.

Proposition 2.2.1 If a semigroup $S$ contains a left identity $e$ and a right identity
$f$, then $e = f$ is an identity and there is no other identity.

Proof $ef = f$, as $e$ is a left identity and also $ef = e$, as $f$ is a right identity.
Consequently, $e = f$ is an identity. Next let $1 \in S$ also be an identity. Then 1 is both a left
and a right identity. Consequently, $e = 1$ and $f = 1$. □

This proposition shows that a semigroup has at most one identity element and we
denote the identity element (if it exists) of a semigroup by 1.

Definition 2.2.4 Let $S$ be a semigroup. An element $0 \in S$ is called a left (right)
zero in $S$ iff $0a = 0$ ($a0 = 0$) for all $a \in S$ and 0 is called a zero iff it is both a left
and a right zero in $S$.

If an element $a \in S$ is such that $a \neq 0$, then $a$ is called a non-zero element of $S$.

Proposition 2.2.2 If a semigroup $S$ contains a left zero $z$ and also a right zero $w$,
then $z = w$ is a zero and there is no other zero.

Proof Similar to the proof of Proposition 2.2.1. □

Definition 2.2.5 A semigroup S with the identity element is called a monoid. 

Example 2.2.2 (i) (Z, •) is a monoid under usual multiplication, where 1 is the 
identity. 

(ii) The set of even integers is not a monoid under usual multiplication. 

If a semigroup $S$ has no identity element, it is easy to adjoin an extra element 1 to
the set $S$. For this we extend the given binary operation ‘$\cdot$’ on $S$ to $S \cup \{1\}$ by defining
$1 \cdot a = a \cdot 1 = a$ for all $a \in S$ and $1 \cdot 1 = 1$. Then $S \cup \{1\}$ becomes a semigroup with
identity element 1.

Analogously, it is easy to adjoin an element 0 to $S$ and extend the given binary
operation on $S$ to $S \cup \{0\}$ by defining $0 \cdot a = a \cdot 0 = 0$ for all $a \in S$ and $0 \cdot 0 = 0$.
Then $S \cup \{0\}$ becomes a semigroup with zero element 0.

Definition 2.2.6 Let S be a semigroup with zero element 0. If for two non-zero 
elements a,b e S, ab = 0, then a and b are called respectively a left and a right 
divisor of zero. 

Example 2.2.3 $M_2(\mathbb{Z})$, the set of $2 \times 2$ matrices over $\mathbb{Z}$, is a semigroup
(non-commutative) under usual multiplication of matrices with $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ as zero element.

Clearly, (^) and (^) are non-zero elements of Af 2 (Z) such that ( 30 X 15 ) = 
(q q). As a result, ( ^ q ) is a left divisor of zero, having ( ^ ) the corresponding 
right divisor of zero. Again (~^)(^) = (^) shows that (^) is also a right 

divisor of zero, having a different matrix ( 4 ) the corresponding left divisor of 
zero. 
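
Divisors of zero in $M_2(\mathbb{Z})$ are easy to exhibit by direct multiplication. The sketch below is illustrative only: the matrices `A` and `B` are chosen here for the purpose and are not claimed to be the ones printed in Example 2.2.3, and `matmul2` is a hypothetical helper.

```python
def matmul2(A, B):
    """Product of two 2 x 2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

ZERO = ((0, 0), (0, 0))
A = ((1, 0), (3, 0))   # illustrative non-zero matrix
B = ((0, 0), (1, 5))   # illustrative non-zero matrix

print(matmul2(A, B) == ZERO)   # True: A is a left and B a right divisor of zero
print(matmul2(B, A))           # ((0, 0), (16, 0)): BA is non-zero, so AB != BA
```

The second line also illustrates that multiplication in $M_2(\mathbb{Z})$ is non-commutative.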




If $a_1, a_2, \ldots, a_n$ are elements of a semigroup $S$, we define the product $a_1 a_2 \cdots a_n$
inductively, as $(a_1 a_2 \cdots a_{n-1}) a_n$.

Theorem 2.2.1 Let $S$ be a semigroup. Then all the products obtained by insertion
of parentheses in the sequence $a_1, a_2, \ldots, a_n, a_{n+1}, a_{n+2}$ of elements of $S$ are equal
to the product $a_1 a_2 \cdots a_n a_{n+1} a_{n+2}$. (Generalized associative law.)

Proof By the associative property in $S$, the theorem holds for $n = 1$. Let the theorem
hold for all values of $n$ such that $1 \le n \le r$ and let $p$ and $p'$ be the products of
an ordered set of $2 + r + 1$ elements $a_1, a_2, \ldots, a_{2+r+1}$, for two different modes
of association obtained by insertion of parentheses. Since the theorem holds for
all values of $n \le r$, the parentheses can all be deleted up to the penultimate stage.
Then we have $p = (a_1 a_2 \cdots a_t)(a_{t+1} \cdots a_{2+r+1})$, where $1 \le t \le 2 + r$, and $p' =
(a_1 \cdots a_i)(a_{i+1} \cdots a_{2+r+1})$, where $1 \le i \le 2 + r$. If $i = t$, then $p = p'$. Otherwise,
we can assume without loss of generality that $i > t$. Then

$$p' = ((a_1 a_2 \cdots a_t)(a_{t+1} \cdots a_i))(a_{i+1} \cdots a_{2+r+1})$$

$$= (a_1 \cdots a_t)((a_{t+1} \cdots a_i)(a_{i+1} \cdots a_{2+r+1})), \text{ by associativity in } S$$

$$= (a_1 \cdots a_t)(a_{t+1} \cdots a_i a_{i+1} \cdots a_{2+r+1}) = p.$$

Thus $p = p'$ in either case, and the theorem holds for $n = r + 1$ also. Hence by
the principle of mathematical induction the theorem holds for all positive integral
values of $n$. □
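
The generalized associative law can be observed experimentally by evaluating a sequence under every possible parenthesization. The sketch below is illustrative only: `all_products` is a hypothetical helper, string concatenation stands in for an associative operation, and subtraction of integers serves as a non-associative contrast.

```python
from functools import reduce

def all_products(seq, op):
    """Set of values obtained from every way of inserting parentheses into seq."""
    if len(seq) == 1:
        return {seq[0]}
    results = set()
    for t in range(1, len(seq)):                 # split as (a_1 ... a_t)(a_{t+1} ... a_n)
        for left in all_products(seq[:t], op):
            for right in all_products(seq[t:], op):
                results.add(op(left, right))
    return results

concat = lambda u, v: u + v                      # associative: strings under concatenation
words = ("ab", "c", "de", "f", "g")
print(all_products(words, concat))               # {'abcdefg'}
print(reduce(concat, words))                     # abcdefg, the product (..((a1 a2) a3)..) an

subtract = lambda u, v: u - v                    # not associative
print(len(all_products((8, 4, 2, 1), subtract)) > 1)   # True
```

For the associative operation every bracketing collapses to the single left-normed product, as the theorem asserts; for subtraction several distinct values appear.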

Powers of an Element Let $S$ be a semigroup and $a \in S$. Then the powers of $a$
are defined recursively:

$$a^1 = a, \quad a^{n+1} = a^n \cdot a \quad \text{for } n = 1, 2, 3, \ldots;$$

$n$ is called the index of $a^n$. Thus the powers of $a$ are defined for all positive integral
values of $n$.

Proposition 2.2.3 Let $S$ be a semigroup. Then for any element $a \in S$, $a^r a^t = a^{r+t}$
and $(a^r)^t = a^{rt}$, where $r$, $t$ are positive integers.

Proof The proof follows from Theorem 2.2.1. □

Definition 2.2.7 Let $S$ be a semigroup with identity 1 and $a \in S$. An element $a' \in S$
is said to be a left inverse of $a$ with respect to 1 iff $a'a = 1$. Similarly, an element
$a^* \in S$ is a right inverse of $a$ iff $aa^* = 1$. An element which is both a left and a right
inverse of an element $a$, with respect to 1, is called a two-sided inverse or an inverse
of $a$.

Theorem 2.2.2 Let $G$ be a semigroup with an identity element 1. If $a \in G$ has an
inverse, then it is unique.

Proof Let $b$ and $b'$ be inverses of $a$ in $G$. Then

$b = b1$

$= b(ab')$, since $b'$ is an inverse of $a$

$= (ba)b'$ (by associativity in $G$)

$= 1b'$, since $ba = 1$

$= b'$. □

We denote by $a^{-1}$ the unique inverse (if it exists) of an element $a$ of a semigroup.
The set $M(X)$ of all mappings of a given non-empty set $X$ into itself, where
the binary operation is the usual composition ‘$\circ$’ of mappings, forms an important
semigroup.

Proposition 2.2.4 If $X$ is a non-empty set, $M(X)$ is a monoid.

Proof Clearly, $(M(X), \circ)$ is a semigroup under usual composition of mappings. Let
$1_X : X \to X$ be the identity map defined by $1_X(x) = x$ for all $x \in X$. Then for any
$f \in M(X)$, $(1_X \circ f)(x) = 1_X(f(x)) = f(x)$ and $(f \circ 1_X)(x) = f(1_X(x)) = f(x)$
for all $x \in X$ imply $1_X \circ f = f \circ 1_X = f$. Consequently, $M(X)$ is a monoid with
$1_X$ as its identity element. □

Remark Not every element of $M(X)$ need have an inverse.

For example, if $X = \{1, 2, 3\}$, then $f \in M(X)$ defined by $f(1) = f(2) =
f(3) = 1$ has no inverse in $M(X)$. Otherwise, there exists an element $g \in M(X)$
such that $f \circ g = g \circ f = 1_X$. Then $1 = 1_X(1) = (g \circ f)(1) = g(f(1)) = g(1)$ and
$2 = 1_X(2) = (g \circ f)(2) = g(f(2)) = g(1)$ imply that $g(1)$ has two distinct values.
This contradicts the assumption that $g$ is a map.

We now characterize the elements of M(X) which have inverses. 

Theorem 2.2.3 An element f in the monoid M(X) has an inverse iff f is a bijec- 
tion. 

Proof The proof follows from Corollary of Theorem 1.2.7 of Chap. 1 (by taking 
X = Y in particular). □ 
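
For a small set $X$ the monoid $M(X)$ can be enumerated completely, which gives a finite check of Theorem 2.2.3 and of the remark about the constant map. The sketch below is illustrative only; maps are represented as dictionaries, and all helper names are assumptions.

```python
from itertools import product

X = (1, 2, 3)
# All 27 maps X -> X, each stored as a dictionary {x: f(x)}.
maps = [dict(zip(X, images)) for images in product(X, repeat=len(X))]
identity = {x: x for x in X}

def compose(f, g):
    """The map f o g, i.e. x -> f(g(x))."""
    return {x: f[g[x]] for x in X}

def has_inverse(f):
    return any(compose(f, g) == identity and compose(g, f) == identity for g in maps)

def is_bijection(f):
    return sorted(f.values()) == list(X)   # on a finite set, onto X is the same as bijective

constant = {1: 1, 2: 1, 3: 1}
print(has_inverse(constant))                                   # False
print(all(has_inverse(f) == is_bijection(f) for f in maps))    # True
print(sum(has_inverse(f) for f in maps))                       # 6
```

The six invertible elements are exactly the permutations of $X$.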

Let $X$ be a non-empty set and $B(X)$ denote the set of all binary relations on $X$.
Define a binary operation ‘$\circ$’ on $B(X)$ as follows:

If $\rho, \sigma \in B(X)$, then $\sigma \circ \rho = \{(x, y) \in X \times X : \exists z \in X \text{ such that } (x, z) \in
\rho \text{ and } (z, y) \in \sigma\}$.

Then $B(X)$ is a semigroup with identity element $\Delta = \{(x, x) : x \in X\}$.

Definition 2.2.8 A relation $\rho$ on a semigroup $S$ is called a congruence relation on
$S$ iff

(i) $\rho$ is an equivalence relation on $S$;

(ii) $(a, b) \in \rho \Rightarrow (ca, cb) \in \rho$ and $(ac, bc) \in \rho$ for all $c \in S$.

Let $\rho$ be a congruence relation on a semigroup $S$ and $S/\rho$ the set of all
$\rho$-equivalence classes $a\rho$, $a \in S$. Define a binary operation ‘$\circ$’ on $S/\rho$ by $a\rho \circ b\rho =
(ab)\rho$, for all $a, b \in S$. Clearly, the operation ‘$\circ$’ is well defined.

Then $(S/\rho, \circ)$ becomes a semigroup.

Definition 2.2.9 A mapping $f$ from a semigroup $S$ into a semigroup $T$ is called a
homomorphism iff for all $a, b \in S$,

$$f(ab) = f(a) f(b).$$

Let $S$ and $T$ be semigroups. Then a homomorphism $f : S \to T$ is called

(i) an epimorphism iff $f$ is surjective;

(ii) a monomorphism iff $f$ is injective; and

(iii) an isomorphism iff $f$ is bijective.

If $f : S \to T$ is an isomorphism, we say that $S$ is isomorphic to $T$, denoted $S \cong T$.

Proposition 2.2.5 Let $S$ be a semigroup and $\rho$ a congruence relation on $S$. Then
the map $p : S \to S/\rho$ defined by $p(a) = a\rho$ for all $a \in S$ is a homomorphism from
$S$ onto $S/\rho$ ($p$ is called the natural homomorphism).

Proof Trivial. □

Definition 2.2.10 Let $S$ and $T$ be two semigroups and $f : S \to T$ a homomorphism
from $S$ onto $T$. The kernel of $f$, denoted by $\ker f$, is defined by $\ker f = \{(a, b) \in
S \times S : f(a) = f(b)\}$.

Clearly, $\rho = \ker f$ is a congruence relation on $S$. Then $S/\rho$ is a semigroup.

Proposition 2.2.6 Let $f$ be a homomorphism from a semigroup $S$ onto a
semigroup $T$. Then the semigroup $S/\ker f$ is isomorphic to $T$.

Proof Suppose $\rho = \ker f$. Define $g : S/\rho \to T$ by $g(a\rho) = f(a)$ for all $a \in S$.
Clearly, $g$ is an isomorphism. □
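
Proposition 2.2.6 can be traced on a small finite example. The sketch below is illustrative only and rests on an assumed choice of semigroups: $S = \mathbb{Z}_{12}$ and $T = \mathbb{Z}_4$ under multiplication, with $f$ the reduction modulo 4. All names in the code are hypothetical.

```python
S = range(12)
T = range(4)
mul_S = lambda a, b: (a * b) % 12
mul_T = lambda a, b: (a * b) % 4
f = lambda a: a % 4                       # reduction modulo 4

# f is a homomorphism from S onto T.
assert all(f(mul_S(a, b)) == mul_T(f(a), f(b)) for a in S for b in S)
assert {f(a) for a in S} == set(T)

def rho_class(a):
    """The class of a under rho = ker f: all x in S with f(x) = f(a)."""
    return frozenset(x for x in S if f(x) == f(a))

quotient = {rho_class(a) for a in S}

def mul_quotient(A, B):
    """Product of two classes, computed from representatives."""
    products = {rho_class(mul_S(a, b)) for a in A for b in B}
    assert len(products) == 1             # independent of the chosen representatives
    return next(iter(products))

g = {A: f(next(iter(A))) for A in quotient}               # g(a rho) = f(a)
assert len(set(g.values())) == len(quotient) == len(T)    # g is a bijection
assert all(g[mul_quotient(A, B)] == mul_T(g[A], g[B]) for A in quotient for B in quotient)

print(sorted(sorted(A) for A in quotient))
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
```

The four classes of $\ker f$ correspond one-to-one to the four elements of $T$, and the correspondence respects the products.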

Definition 2.2.11 A non-empty subset $I$ of a semigroup $S$ is called a left (right)
ideal of $S$ iff $xa \in I$ ($ax \in I$) for all $x \in S$, for all $a \in I$, and $I$ is called an ideal of
$S$ iff $I$ is both a left ideal and a right ideal of $S$.

Example 2.2.4 Let $S$ be a semigroup. Then

(i) $S$ is an ideal of $S$;

(ii) for each $a \in S$, $Sa = \{sa : s \in S\}$ is a left ideal and $aS = \{as : s \in S\}$ is a right
ideal of $S$.

Definition 2.2.12 A semigroup S is called left {right) simple iff S has no left (right) 
ideals other than S. 

Definition 2.2.13 Let S' be a semigroup. An element a e S is called an idempotent 
iff aa = a 2 = a. 

Clearly if 1 or 0 g S, they are idempotents. The set of idempotents of S is denoted 
by E(S). 

If a semigroup S contains 0, then an element c e S is said to be a nilpotent 
element of rank n for n > 2 iff c n = 0, but c n ~ l / 0 and hence (c n ~ 1 ) 2 = 0, since 
2 (n — 1) > n, so that c n ~ l is a nilpotent element of rank 2. Thus the following 
proposition is immediate. 

Proposition 2.2.7 If a semigroup S contains any nilpotent element , there also 

exists a nilpotent element of rank 2 in S. 

Definition 2.2.14 A semigroup S is called a band iff S = E(S), i.e., iff every
element of S is idempotent. S is called a right (left) group iff S is right (left)
simple and left (right) cancellative.

Let S be a band. Define a relation '≤' on S by e ≤ f iff ef = fe = e. Then
clearly, ≤ is a partial order on S. A band S is said to be commutative iff S is a
commutative semigroup.

Let X and Y be two sets and define a binary operation on S = X × Y as follows:
(x, y)(z, t) = (x, t), where x, z ∈ X and y, t ∈ Y. Then S is a band, which is
non-commutative whenever X or Y has more than one element. We call S the
rectangular band on X × Y.

Definition 2.2.15 Let S be a semigroup. An element a ∈ S is called regular iff
a ∈ aSa, i.e., iff a = axa for some x ∈ S. A semigroup S is called regular iff every
element of S is regular.

Clearly, if axa = a, then e = ax and f = xa are idempotents in S.

Let a ∈ S. An element b ∈ S is said to be an inverse of a iff aba = a and bab = b.
Clearly, if a is a regular element of S, then a has at least one inverse: if
axa = a, then b = xax satisfies aba = a and bab = b.

Example 2.2.5 (i) Every right group is a regular semigroup.

(ii) Every band is a regular semigroup.

(iii) M(X) (cf. Proposition 2.2.4) is a regular semigroup under usual composition
of mappings.

(iv) Let M₂(Q) be the set of all 2 × 2 matrices \begin{pmatrix} a & b \\ c & d \end{pmatrix} over Q, the set of rationals.
Then M₂(Q) is a regular semigroup with respect to the usual matrix multiplication.

Definition 2.2.16 A regular semigroup S is called an inverse semigroup iff for
every element a of S there exists a unique element a⁻¹ such that aa⁻¹a = a and
a⁻¹aa⁻¹ = a⁻¹.

Clearly, every inverse semigroup is a regular semigroup, but there are regular
semigroups which are not inverse semigroups. For example, a rectangular band is a
regular semigroup, but it is not an inverse semigroup.
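
The last claim can be verified directly on a small rectangular band. The sketch below assumes the sets X = {0, 1} and Y = {"p", "q"} (our choice, purely illustrative) and uses the rectangular band product (x, y)(z, t) = (x, t); it counts the inverses of each element in the sense of the definition above.

```python
from itertools import product

# Assumed small example: X = {0, 1}, Y = {"p", "q"}.
X = [0, 1]
Y = ["p", "q"]
S = list(product(X, Y))

def mul(u, v):
    """Rectangular band product: (x, y)(z, t) = (x, t)."""
    return (u[0], v[1])

is_band = all(mul(a, a) == a for a in S)
is_associative = all(
    mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in product(S, repeat=3)
)
is_commutative = all(mul(a, b) == mul(b, a) for a, b in product(S, S))

def inverses(a):
    """All b with aba = a and bab = b."""
    return [b for b in S if mul(mul(a, b), a) == a and mul(mul(b, a), b) == b]

inverse_counts = {a: len(inverses(a)) for a in S}
```

Every element turns out to have four inverses (each element of S is an inverse of every other), so the band is regular but the uniqueness required of an inverse semigroup fails.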

Let (S, ∘) be a semigroup. A non-empty subset T of S is called a sub-semigroup
of S iff ∀a, b ∈ T, a ∘ b ∈ T, i.e., iff (T, ∘) is also a semigroup, where the latter
operation '∘' is the restriction of '∘' to T × T.

For example, (N, ·) is a sub-semigroup of (Z, ·), where '·' denotes the usual
multiplication. Clearly, for any semigroup S, S is a sub-semigroup of itself.

Example 2.2.6 Let S = M₂(R) be the multiplicative semigroup of all 2 × 2 real
matrices and T the subset of all matrices of the form \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}, a ∈ R. Then T is a
sub-semigroup of S.

S has an identity I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} and T has an identity I' = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, but I ≠ I'.

This example shows that a semigroup S with an identity 1 may have a sub-semigroup
with an identity e such that e ≠ 1.
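
A quick numerical check of this example is sketched below. It stores a 2 × 2 matrix as a pair of rows and tests closure and the identity property of I' only on a few sample matrices of T (the samples and helper names are our own assumptions), so it illustrates the example rather than proving it.

```python
def matmul(A, B):
    """Product of 2 x 2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

I = ((1, 0), (0, 1))        # identity of S = M_2(R)
I_prime = ((1, 0), (0, 0))  # identity of the sub-semigroup T

def in_T(M):
    """T: only the top-left entry may be non-zero."""
    return M[0][1] == 0 and M[1][0] == 0 and M[1][1] == 0

# A few sample elements of T (hypothetical test data).
samples = [((a, 0), (0, 0)) for a in (-2, 0, 1, 3)]

closed = all(in_T(matmul(A, B)) for A in samples for B in samples)
acts_as_identity = all(
    matmul(I_prime, A) == A == matmul(A, I_prime) for A in samples
)
```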

Proposition 2.2.8 If {A_i} is any collection of sub-semigroups of a semigroup S,
then ⋂_i A_i is also a sub-semigroup of S, provided ⋂_i A_i ≠ ∅.

Proof Let M = ⋂_i A_i (≠ ∅) and a, b ∈ M. Then a, b ∈ A_i ⇒ ab ∈ A_i for each
sub-semigroup A_i ⇒ ab ∈ M ⇒ M is a sub-semigroup of S, as the associative
property is hereditary. □

If X is a non-empty subset of a semigroup S, then ⟨X⟩, the sub-semigroup of S
generated by X, defined as the intersection of all sub-semigroups of S containing X,
is the smallest sub-semigroup of S containing X. We say that X generates S iff
⟨X⟩ = S. Clearly ⟨S⟩ = S.

Let a ∈ S; then ⟨{a}⟩ = ⟨a⟩ is called the cyclic sub-semigroup of S generated
by a. A semigroup S is called cyclic iff there exists an element a ∈ S such that
S = ⟨a⟩.
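
In a finite semigroup the cyclic sub-semigroup ⟨a⟩ = {a, a², a³, ...} can be listed by multiplying by a until a power repeats. The following is a minimal sketch, assuming as an example the multiplicative semigroup of integers modulo 12 and a = 2 (our choice of example; the function names are hypothetical).

```python
def cyclic_subsemigroup(a, mul):
    """Return [a, a^2, a^3, ...] up to (not including) the first repetition."""
    powers = []
    x = a
    while x not in powers:
        powers.append(x)
        x = mul(x, a)
    return powers

def mul_mod_12(x, y):
    return (x * y) % 12

powers_of_2 = cyclic_subsemigroup(2, mul_mod_12)

# Idempotents among the powers of 2.
idempotents = [x for x in powers_of_2 if mul_mod_12(x, x) == x]
```

Here ⟨2⟩ = {2, 4, 8}, and 4 = 2² is idempotent since 4 · 4 = 16 ≡ 4 (mod 12).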

Let S be an inverse semigroup and a, b ∈ S. Then the relation '≤' defined by
a ≤ b iff there exists an idempotent e ∈ S such that a = eb is a partial order relation.

Let E(S) be the set of all idempotents of an inverse semigroup S. Then E(S)
is a sub-semigroup of S. Moreover, E(S) is a semilattice, called the semilattice of
idempotents of S.

Some Other Semigroups A semigroup S is called fully idempotent iff each ideal
I of S is idempotent, i.e., iff I² = I.

The following statements for a semigroup S are equivalent:

(a) S is fully idempotent;

(b) for each pair of ideals I, J of S, I ∩ J = IJ;

(c) for each right ideal R and two-sided ideal I, R ∩ I ⊆ IR;

(d) for each left ideal L and two-sided ideal I, L ∩ I ⊆ LI.

Definition 2.2.17 A function l (r) on a semigroup S is said to be a left (right)
translation on S iff l (r) is written as a left (right) operator and satisfies

l(xy) = (lx)y ((xy)r = x(yr)) ∀x, y ∈ S.

The pair (l, r) is said to be linked iff x(ly) = (xr)y ∀x, y ∈ S and is then called a
bitranslation on S.

The set L(S) of all left translations (R(S) of all right translations) on S with the
operation of multiplication defined by (l l₁)x = l(l₁x) (x(r r₁) = (xr)r₁) ∀x ∈ S is a
semigroup.

The set B(S) of all bitranslations on S with the operation defined by
(l, r)(l₁, r₁) = (l l₁, r r₁) is a semigroup and is called the translational hull of S.

Two bitranslations (l, r) and (l₁, r₁) are said to be equal iff l = l₁ and r = r₁.

For any s ∈ S, the functions l_s and r_s given by l_s x = sx, x r_s = xs ∀x ∈ S are
called respectively the inner left and inner right translation on S induced by s; the
pair T_s = (l_s, r_s) is called the inner bitranslation on S induced by s.

The set T(S) of all inner bitranslations on S is called the inner part of B(S).
If L_i(S) and R_i(S) are the sets of all inner left and inner right translations on S
respectively, then T(S) ⊆ B(S), L_i(S) ⊆ L(S), and R_i(S) ⊆ R(S).

Lemma 2.2.1 If l ∈ L(S), r ∈ R(S), l_s ∈ L_i(S), r_s ∈ R_i(S), T_s ∈ T(S), w =
(l, r) ∈ B(S), then for every s ∈ S,

(i) l l_s = l_{ls}, l_s l = l_{sr};

(ii) r_s r = r_{sr}, r r_s = r_{ls}; and

(iii) wT_s = T_{ls}, T_s w = T_{sr}.

Proof (i) (l l_s)x = l(l_s x) = l(sx) = (ls)x = l_{ls} x ∀x ∈ S ⇒ l l_s = l_{ls} ∈ L_i(S).

Again (l_s l)x = l_s(lx) = s(lx) = (sr)x = l_{sr} x ∀x ∈ S ⇒ l_s l = l_{sr} ∈ L_i(S).

(ii) x(r_s r) = (x r_s)r = (xs)r = x(sr) = x r_{sr} ∀x ∈ S ⇒ r_s r = r_{sr} ∈ R_i(S).
Similarly, r r_s = r_{ls} ∈ R_i(S).

By using (i) and (ii) it follows that

(iii) wT_s = (l, r)(l_s, r_s) = (l l_s, r r_s) = (l_{ls}, r_{ls}) = T_{ls} ∈ T(S).

Similarly, T_s w = T_{sr} ∈ T(S). □

Corollary 1 For a semigroup S,

(i) L_i(S) is an ideal of L(S);

(ii) R_i(S) is an ideal of R(S); and

(iii) T(S) is an ideal of B(S).

Corollary 2 For a semigroup S,

(i) L_i(S) is a sub-semigroup of L(S);

(ii) R_i(S) is a sub-semigroup of R(S); and

(iii) T(S) is a sub-semigroup of B(S).

Definition 2.2.18 A semigroup S is said to be left (right) reductive iff xa = xb
(ax = bx) ∀x ∈ S ⇒ a = b. If S is both left and right reductive, then S is said to
be reductive. If xa = xb and ax = bx ∀x ∈ S ⇒ a = b, then S is said to be weakly
reductive.

Clearly, a reductive semigroup is also weakly reductive. 

For the application of category theory we follow the terminology of Appendix B.
It is easy to check that semigroups and their homomorphisms form a category,
denoted by 𝒮_G.

We note that for a semigroup S ∈ 𝒮_G, T(S) is also a semigroup in 𝒮_G. For
f : S → W in 𝒮_G, define f_* = T(f) : T(S) → T(W) by f_*((l_s, r_s)) = (l_{f(s)}, r_{f(s)}).
Then for g : W → X in 𝒮_G, (gf)_* : T(S) → T(X) in 𝒮_G is such that

(gf)_*((l_s, r_s)) = (l_{(gf)(s)}, r_{(gf)(s)}) = (l_{g(f(s))}, r_{g(f(s))}) = g_*((l_{f(s)}, r_{f(s)}))

= g_*(f_*((l_s, r_s))) ∀(l_s, r_s) ∈ T(S) ⇒ (gf)_* = g_* f_*.

Also for the identity homomorphism I_S : S → S in 𝒮_G, (I_S)_* : T(S) → T(S) in 𝒮_G is
such that (I_S)_*((l_s, r_s)) = (l_{I_S(s)}, r_{I_S(s)}) = (l_s, r_s) ∀(l_s, r_s) ∈ T(S) ⇒ (I_S)_* is
the identity homomorphism. The following theorem is immediate from the above
discussion.

Theorem 2.2.4 T : 𝒮_G → 𝒮_G is a covariant functor (see Appendix B).

Lemma 2.2.2 For a semigroup S, the function f : S → T(S) defined by f(s) =
T_s ∀s ∈ S is a homomorphism from S onto T(S).

Proof f(st) = T_{st} = T_s T_t = f(s)f(t) ∀s, t ∈ S ⇒ f is a homomorphism.
Moreover, for any T_s ∈ T(S), we can find s ∈ S such that f(s) = T_s. □

Theorem 2.2.5 The function f : S → T(S) defined by f(s) = T_s is an isomorphism
iff S is reductive.

Proof Suppose S is reductive and f(s) = f(t) for some s, t ∈ S. Then T_s = T_t ⇒
(l_s, r_s) = (l_t, r_t) ⇒ l_s = l_t and r_s = r_t ⇒ sx = tx and xs = xt ∀x ∈ S ⇒ s = t ⇒ f
is injective. Hence by Lemma 2.2.2, f is an isomorphism.

Conversely, let f be an isomorphism and let ax = bx, xa = xb hold for some
a, b ∈ S and ∀x ∈ S. Then l_a x = l_b x and x r_a = x r_b hold ∀x ∈ S. This shows that
l_a = l_b and r_a = r_b, yielding f(a) = T_a = (l_a, r_a) = (l_b, r_b) = T_b = f(b).
Hence a = b ⇒ S is reductive. □
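
For a finite semigroup the map s ↦ T_s can be tabulated directly. The sketch below stores l_s and r_s as tuples of values and tests the implication used in the proof above, namely (ax = bx and xa = xb for all x) ⇒ a = b, which Definition 2.2.18 calls weakly reductive. The two example semigroups (multiplication modulo 4, and a null semigroup in which every product is 0) and all function names are our own assumptions.

```python
from itertools import product

def inner_bitranslation(s, elems, mul):
    """T_s = (l_s, r_s), each stored as the tuple of its values on elems."""
    l_s = tuple(mul(s, x) for x in elems)   # l_s x = s x
    r_s = tuple(mul(x, s) for x in elems)   # x r_s = x s
    return (l_s, r_s)

def is_weakly_reductive(elems, mul):
    """(ax = bx and xa = xb for all x) implies a = b."""
    return all(
        a == b
        or any(mul(a, x) != mul(b, x) or mul(x, a) != mul(x, b) for x in elems)
        for a, b in product(elems, repeat=2)
    )

def s_to_T_s_is_injective(elems, mul):
    images = [inner_bitranslation(s, elems, mul) for s in elems]
    return len(set(images)) == len(elems)

Z4 = list(range(4))

def mult_mod_4(x, y):
    return (x * y) % 4      # has the identity 1

def null_mult(x, y):
    return 0                # every product is 0
```

On (Z_4, ·) both checks succeed, while on the null semigroup all T_s coincide, so s ↦ T_s is onto T(S) but not injective.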

Corollary If S is a cancellative semigroup, then the function f : S → T(S) defined
by f(s) = T_s is an isomorphism.

For a given semigroup S ∈ 𝒮_G, the set F_S(X) of all semigroup homomorphisms
from S to X in 𝒮_G forms a semigroup under usual multiplication of functions.

We also define, for each homomorphism f : X → Y in 𝒮_G, the semigroup
homomorphism f_* = F_S(f) : F_S(X) → F_S(Y) by f_*(g) = f ∘ g ∀g : S → X in 𝒮_G.
Hence the following lemma is immediate.

Lemma 2.2.3 F_S : 𝒮_G → 𝒮_G is a covariant functor.

Let (F_S, T) denote the set of all natural transformations from the covariant
functor F_S to the covariant functor T : 𝒮_G → 𝒮_G (see Appendix B).

Theorem 2.2.6 For each semigroup S ∈ 𝒮_G, the set (F_S, T) admits a semigroup
structure such that (F_S, T) is isomorphic to T(S).

Proof Using the Yoneda Lemma (see Appendix B) for the covariant functors F_S and T,
we find that there is a bijection f_S : T(S) → (F_S, T) for every S ∈ 𝒮_G. As T(S) is
a semigroup, f_S induces a composition on the set (F_S, T), giving it a semigroup
structure such that (F_S, T) is isomorphic to T(S). □

Theorem 2.2.7 A semigroup S is isomorphic to the semigroup (F_S, T) iff S is
reductive.

Proof The theorem is immediate by using Theorems 2.2.5 and 2.2.6. □

Remark Theorem 2.2.6 yields a representation of the covariant functor T : 𝒮_G → 𝒮_G.
In this way the problem of classifying semigroups, i.e., of computing T(S), has
been reduced to the determination of the set of natural transformations (F_S, T).

Let S be a semigroup, T a sub-semigroup of S and w ∈ B(T). Then a bitranslation
w̄ ∈ B(S) is said to be an extension of w to S iff w̄x = wx and xw̄ = xw ∀x ∈ T,
and we write w̄|T = w.

Theorem 2.2.8 Let S be a semigroup, T a reductive semigroup, f : S → T an
epimorphism and let w ∈ B(S). Then there exists a unique element w' ∈ B(T) such
that for each x ∈ S, w'f(x) = f(wx) and f(x)w' = f(xw).

Proof For t ∈ T, ∃s ∈ S such that f(s) = t. Now define w't = f(ws) and tw' =
f(sw). Then the theorem follows. □

Theorem 2.2.9 Let S be a reductive semigroup and let l, r : S → S be such that the
pair (l, r) is linked. Then (l, r) ∈ B(S).

Proof It is sufficient to show that l ∈ L(S) and r ∈ R(S). Let x, y ∈ S. Then for each
t ∈ S, t((lx)y) = (t(lx))y = ((tr)x)y = (tr)(xy) = t(l(xy)). Since S is reductive,
(lx)y = l(xy) ⇒ l ∈ L(S). Similarly, r ∈ R(S). □
2.2.1 Topological Semigroups


Let S be a semigroup. If A and B are subsets of S, we define AB = {ab : a ∈
A and b ∈ B}. If a semigroup S is a Hausdorff space such that the multiplication
function (x, y) ↦ xy is continuous with the product topology on S × S, then S is
called a topological semigroup. The condition that the multiplication on S is
continuous is equivalent to the condition that for each x, y ∈ S and each open set W
in S with xy ∈ W, there exist open sets U and V such that x ∈ U, y ∈ V and UV ⊆ W.

Note that any semigroup can be made into a topological semigroup by endowing
it with the discrete topology, and hence a finite semigroup is a compact semigroup.

Let C be the space of complex numbers with complex multiplication. Then C
becomes a topological semigroup with a zero and an identity and no other
idempotents. The unit disk D = {z ∈ C : |z| < 1} is a complex sub-semigroup of C. The
circle S¹ = {z ∈ C : |z| = 1} is also a sub-semigroup of C.

Problems

A Let A and B be subsets of a topological semigroup S.

(a) If A and B are compact, then AB is compact.

(b) If A and B are connected, then AB is connected.

(c) If B is closed, then {x ∈ S : xA ⊆ B} is closed.

(d) If B is closed, then {x ∈ S : A ⊆ xB} is closed.

(e) If B is compact, then {x ∈ S : xA ⊆ Bx} is closed.

(f) If A is compact and B is open, then {x ∈ S : xA ⊆ B} is open.

(g) If A is compact and B is closed, then {x ∈ S : xA ∩ B ≠ ∅} is closed.

Proof [See Carruth et al. (1983, 1986)]. □

B If S is a topological semigroup which is a Hausdorff space, then the set E(S) of
all idempotents of S is a closed subset of S.

Proof E(S) is the set of fixed points of the continuous function f : S → S defined
by x ↦ x².

Define g : S → S × S by g(x) = (f(x), x) = (x², x). Then g is continuous. Since
the diagonal Δ(S) of a Hausdorff space S is closed in S × S, g⁻¹(Δ(S)) = {x ∈ S : x² = x}
is closed. □

Semigroups of Continuous Self Maps Let S(X) denote the semigroup of all
continuous maps from a topological space X into itself under composition of maps.

At present there are at least four broad areas of active research on S(X).

These are:

(1) homomorphisms from S(X) into S(Y);

(2) finitely generated sub-semigroups of S(X);

(3) Green's relations and related topics for S(X);

(4) congruences on S(X).

We list some problems:

Problem 1 (a) If X and Y are homeomorphic spaces, then S(X) and S(Y) are
isomorphic semigroups. Is the converse true?

[Hint. Let X be a discrete space with more than one element and Y be the same
set, but endowed with the indiscrete topology. Then X and Y are not homeomorphic
but S(X) and S(Y) are isomorphic under the identity map.]

(b) If X and Y are two compact 0-dimensional metric spaces, then X and Y are
homeomorphic iff the semigroups S(X) and S(Y) are isomorphic.

Problem 2 Determine Green's relations for elements of S(X) which are not
necessarily regular.

Remark Some work has been done in this direction but there is much yet to do.

Problem 3 Determine all congruences on S(X).

Remark If X is discrete, this was solved. For the case when X is not discrete [see
Magill Jr. (1982)].

Problem 4 If X is Hausdorff and locally compact, then S(X) endowed with the
compact open topology is a topological semigroup. Is the converse true?

Remark The converse is not true [see De Groot (1959)].


2.2.2 Fuzzy Ideals in a Semigroup 

Let S be a non-empty set. Consider the closed interval [0, 1]. Any mapping from
S → [0, 1] is called a fuzzy subset of S. It is a generalization of the usual concept
of a subset of S in the following sense.

Let A be a subset of S. Define its characteristic function κ_A by

κ_A(x) = 1 when x ∈ A
       = 0 when x ∉ A.

So for each subset A of S there corresponds a fuzzy subset κ_A.

Conversely, suppose A : S → [0, 1] is a mapping such that Im A = {0, 1}.

Let A' = {x ∈ S : A(x) = 1}.

Now A' is a subset of S and A is the characteristic function of A'.

Definition 2.2.19 Let S be a semigroup. A fuzzy subset A (that is, a mapping from
S → [0, 1]) is called a left (right) fuzzy ideal iff

A(xy) ≥ A(y) (A(xy) ≥ A(x)) for all x, y ∈ S.

Definition 2.2.20 Let A, B be two fuzzy subsets in a semigroup S. Define A ∪ B,
A ∩ B, A ∘ B by

(A ∪ B)(x) = max(A(x), B(x)), ∀x ∈ S;

(A ∩ B)(x) = min(A(x), B(x)), ∀x ∈ S;

(A ∘ B)(x) = max{min(A(u), B(v)) : x = uv, u, v ∈ S}, if x can be expressed in the form x = uv,

           = 0, if x cannot be expressed in the form x = uv.
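
On a finite semigroup these three operations can be computed by direct enumeration. The following is a minimal sketch, assuming the semigroup of integers modulo 4 under multiplication and two illustrative membership functions A and B (the numerical values are our own, chosen so that A and B are fuzzy left ideals).

```python
elems = range(4)

def mul(x, y):
    return (x * y) % 4

# Hypothetical fuzzy subsets of Z_4, given as membership tables.
A = {0: 1.0, 1: 0.2, 2: 0.6, 3: 0.2}
B = {0: 0.9, 1: 0.1, 2: 0.9, 3: 0.1}

def fuzzy_union(A, B):
    return {x: max(A[x], B[x]) for x in elems}

def fuzzy_intersection(A, B):
    return {x: min(A[x], B[x]) for x in elems}

def fuzzy_product(A, B):
    """(A o B)(x) = max over all factorizations x = uv of min(A(u), B(v))."""
    result = {}
    for x in elems:
        values = [min(A[u], B[v]) for u in elems for v in elems if mul(u, v) == x]
        result[x] = max(values) if values else 0
    return result

def is_fuzzy_left_ideal(A):
    """A(xy) >= A(y) for all x, y."""
    return all(A[mul(x, y)] >= A[y] for x in elems for y in elems)
```

For these particular tables, A ∪ B, A ∩ B and A ∘ B are again fuzzy left ideals, in line with Exercise 6(b) of Exercises-IB.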


2.2.3 Exercises 

Exercises-IA 

1. Verify that the following '*' are binary operations on the Euclidean plane R²:
For (x, y), (x', y') ∈ R²,

(i) (x, y) * (x', y') = (x, y) if (x, y) = (x', y'), and (x, y) * (x', y') = midpoint
of the line segment joining the point (x, y) to the point (x', y') if (x, y) ≠ (x', y');

(ii) (x, y) * (x', y') = (x + x', y + y'), where + is the usual addition in R;

(iii) (x, y) * (x', y') = (x · x', y · y'), where '·' is the usual multiplication in R.

2. Verify that the following '∘' are binary operations on Q:

(i) x ∘ y = x − y − xy;

(ii) xoy = *-±y+ 2 ; 

(iii) io)> = 

Determine which of the above binary operations are associative and commuta- 
tive. 

3. Determine the number of different binary operations which can be defined on a set
of three elements.

4. Let S be the set of the following six mappings f_i : R\{0, 1} → R, defined by

f₁ : x ↦ x;  f₂ : x ↦ 1/(1 − x);  f₃ : x ↦ (x − 1)/x;  f₄ : x ↦ 1/x;

f₅ : x ↦ x/(x − 1);  f₆ : x ↦ 1 − x.

Verify that usual composition of mappings is a binary operation on S and
construct the corresponding multiplication table.
Exercises-IB

1. A binary operation * is defined on Z by x * y = x + y − xy, x, y ∈ Z. Show
that (Z, *) is a semigroup. Let A = {x ∈ Z : x < 1}. Show that (A, *) is a
sub-semigroup of (Z, *).

2. Let A be a non-empty set and S = P(A) its power set. Show that (S, ∩) and
(S, ∪) are semigroups. Do these semigroups have identities? If so, find them.

3. Show that for every element a in a finite semigroup, there is a power of a which
is idempotent.

4. Let (Z, +) be the semigroup of integers under usual addition and (E, +) be the
semigroup of even integers with integer 0 under usual addition. Then the
mapping f : Z → E defined by z ↦ 2z ∀z ∈ Z is a homomorphism. Find a
homomorphism g : Z → E such that g ≠ f.

5. Show that the following statements are equivalent:

(i) S is an inverse semigroup;

(ii) S is regular and its idempotents commute.

6. Prove the following:

(a) Let I be a left ideal of a semigroup S. Then its characteristic function κ_I
is a fuzzy left ideal. Conversely, if for any subset I of S its characteristic
function κ_I is a fuzzy left ideal, then I is a left ideal of S;

(b) If A, B are fuzzy ideals of a semigroup S, then A ∩ B, A ∪ B, A ∘ B are fuzzy
ideals of S.

7. Let S be an inverse semigroup and E(S) the semilattice of idempotents of S.
Prove that

(i) (a⁻¹)⁻¹ = a, ∀a ∈ S

(ii) e⁻¹ = e, ∀e ∈ E(S)

(iii) (ab)⁻¹ = b⁻¹a⁻¹, ∀a, b ∈ S

(iv) aea⁻¹ ∈ E(S) ∀a ∈ S and e ∈ E(S).

8. Let '≤' be a partial order on S such that (S, ∘, ≤) is a partially ordered (p.o.)
semigroup. Define the set P_l°(S) = {p ∈ S : a ≤ p ∘ a ∀a ∈ S} of all left positive
elements of (S, ∘, ≤) and correspondingly the set P_r°(S) of all right positive
ones, and P°(S) = P_l°(S) ∩ P_r°(S). Let E°(S) denote the set of all idempotents
of (S, ∘). Prove the following:

(i) If the p.o. semigroup (S, ∘, ≤) has a least element u, i.e., u ≤ a ∀a ∈ S, then
u is idempotent iff u = x ∘ y holds for some x, y ∈ S;

(ii) If P°(S) = S and f ∈ E°(S), then x ≤ f ⇔ x ∘ f = f ⇔ f ∘ x = f holds
∀x ∈ S.

(iii) If P°(S) = S holds and u is the least element of (S, ≤), then S\{u} is an
ideal of (S, ∘), which is a maximal one. The converse holds if (S, ≤) is a
fully ordered set and if a < b ⇒ a ∘ x = b for some x ∈ S.

9. A non-empty subset A of a semigroup S is called a generalized left semi-ideal
(gls ideal), iff x²A ⊆ A for every

(a) Let Z* denote the set of all non-negative integers. Then Z* is a semigroup
under usual multiplication. Show that A = {x ∈ Z* : x > 6} is a gls ideal.

(b) In an idempotent semigroup S, prove that A is a left ideal of S iff A is a gls
ideal.

(c) Let S be a semigroup. Prove that S is regular iff for each generalized left
semi-ideal A of S and for each right ideal B of S, B ∩ A = B*A, where

B* = {b² : b ∈ B}.


2.3 Groups 

Historically, the concept of a group arose through the study of bijective mappings
A(S) on a non-empty set S. Any mathematical concept comes naturally in a very
concrete form from specific sources. We start with two familiar algebraic systems:
(Z, +) (under usual addition of integers) and (Q⁺, ·) (under usual multiplication of
positive rational numbers). We observe that the system (Z, +) possesses the
following properties:

1. '+' is a binary operation on Z;

2. '+' is associative in Z;

3. Z contains an additive identity element, i.e., there is a special element, namely 0,
such that x + 0 = 0 + x = x for every x in Z;

4. Z contains additive inverses, i.e., to each x ∈ Z, there is an element (−x) in Z,
called its negative, such that x + (−x) = (−x) + x = 0.

We also observe that the other system (Q⁺, ·) possesses similar properties:

1. multiplication '·' is a binary operation on Q⁺;

2. multiplication is associative in Q⁺;

3. Q⁺ contains a multiplicative identity element, i.e., there is a special element,
namely 1, such that x · 1 = 1 · x = x for every x in Q⁺;

4. Q⁺ contains multiplicative inverses, i.e., to each x ∈ Q⁺, there is an element x⁻¹,
which is the reciprocal of x in Q⁺, such that x · (x⁻¹) = (x⁻¹) · x = 1.

If we consciously ignore the notation and terminology, the above four properties are
identical in the algebraic systems (Z, +) and (Q⁺, ·). The concept of a group may
be considered as a distillation of the common structural forms of (Z, +) and (Q⁺, ·)
and of many other similar algebraic systems.

Remark The algebraic system (A(S), ∘) of bijective mappings A(S) on a non-empty
set S (under usual composition of mappings) satisfies all the above four properties.
This group of transformations is not the only system satisfying all the above
properties. For example, the non-zero rationals, reals or complex numbers also
satisfy the above four properties under usual multiplication. So it is convenient to
introduce an abstract concept of a group to include these and other examples.
The above properties lead us to introduce the abstract concept of a group. We define
a group as a monoid in which every element has an inverse. We repeat the definition
in more detail.

Definition 2.3.1 A group G is an ordered pair (G, ·) consisting of a non-empty set
G together with a binary operation '·' defined on G such that

(i) if a, b, c ∈ G, then (a · b) · c = a · (b · c) (associative law);

(ii) there exists an element 1 ∈ G such that 1 · a = a · 1 = a for all a ∈ G (identity
law);

(iii) for each a ∈ G, there exists a' ∈ G such that a' · a = a · a' = 1 (inverse law).

Clearly, the identity element 1 is unique by Proposition 2.2.1. The element a',
called an inverse of a, is unique by Theorem 2.2.2 and is denoted by a⁻¹.

A group G is said to be commutative iff its binary operation '·' is commutative:
a · b = b · a, for all a, b ∈ G; otherwise G is said to be non-commutative.

The definition of a group may be re-written using additive notation: We write
a + b for a · b, −a for a⁻¹, and 0 for 1; a + b, −a, and 0 are, respectively, called
the sum of a and b, the negative of a, and the additive identity or zero, and (G, +) is
called an additive group.

The binary operation in a group need not be commutative. Sometimes a commutative
group is called an Abelian group in honor of Niels Henrik Abel (1802-1829),
one of the pioneers in the study of groups.

Throughout this section a group G will denote an arbitrary multiplicative group
with identity 1 (unless stated otherwise).

Proposition 2.3.1 If G is a group, then

(i) a ∈ G and aa = a imply a = 1;

(ii) for all a, b, c ∈ G, ab = ac implies b = c and ba = ca implies b = c (left and
right cancellation laws);

(iii) for each a ∈ G, (a⁻¹)⁻¹ = a;

(iv) for a, b ∈ G, (ab)⁻¹ = b⁻¹a⁻¹;

(v) for a, b ∈ G, each of the equations ax = b and ya = b has a unique solution
in G.

Proof (i) aa = a ⇒ a⁻¹(aa) = a⁻¹a ⇒ (a⁻¹a)a = 1 ⇒ a = 1.

(ii) ab = ac ⇒ a⁻¹(ab) = a⁻¹(ac) ⇒ (a⁻¹a)b = (a⁻¹a)c ⇒ 1b = 1c ⇒ b = c.
Similarly, ba = ca ⇒ b = c.

(iii) aa⁻¹ = a⁻¹a = 1 ⇒ (a⁻¹)⁻¹ = a.

(iv) (b⁻¹a⁻¹)(ab) = b⁻¹(a⁻¹(ab)) = b⁻¹((a⁻¹a)b) = b⁻¹(1b) = b⁻¹b = 1
and (ab)(b⁻¹a⁻¹) = a(bb⁻¹)a⁻¹ = a1a⁻¹ = aa⁻¹ = 1 ⇒ (ab)⁻¹ = b⁻¹a⁻¹.

(v) a⁻¹b = a⁻¹(ax) = (a⁻¹a)x = 1x = x and ba⁻¹ = (ya)a⁻¹ = y(aa⁻¹) =
y1 = y are solutions of the equations ax = b and ya = b, respectively.

Uniqueness of the two solutions follows from the cancellation laws (ii). □

Corollary If a₁, a₂, ..., a_n ∈ G, then (a₁a₂ ⋯ a_n)⁻¹ = a_n⁻¹ ⋯ a₂⁻¹a₁⁻¹.

Proposition 2.3.2 Let G be a semigroup. Then G is a group iff the following
conditions hold:

(i) there exists a left identity e in G;

(ii) for each a ∈ G, there exists a left inverse a' ∈ G of a with respect to e, so that
a'a = e.

Proof If G is a group, then conditions (i) and (ii) follow trivially. Next let a
semigroup G satisfy conditions (i) and (ii). As e ∈ G, G ≠ ∅. Let a ∈ G and let a' be a
left inverse of a. By (ii), a' itself has a left inverse a'', so that a''a' = e, and
hence aa' = e(aa') = (a''a')(aa') = a''((a'a)a') = a''(ea') = a''a' = e. Thus
a' = a⁻¹ is an inverse of a with respect to e. Moreover,
ae = a(a⁻¹a) = (aa⁻¹)a = ea = a ∀a ∈ G ⇒ e is an identity. Therefore G is a
group. □

Proposition 2.3.3 Let G be a semigroup. Then G is a group iff for all a, b ∈ G the
equations ax = b and ya = b have solutions in G.

Proof If G is a group, then by Proposition 2.3.1(v), the equations ax = b and ya =
b have solutions in G. Conversely, fix a ∈ G and let the equation ya = a have a
solution e ∈ G. Then ea = a. For any b ∈ G, if t (depending on a and b) is a solution
of the equation ax = b, then at = b.

Now, eb = e(at) = (ea)t = at = b.

Consequently, eb = b, ∀b ∈ G ⇒ e is a left identity in G.

Next, a left inverse of an element a ∈ G is given by a solution of ya = e, and the
solution belongs to G. Consequently, for each a ∈ G, there exists a left inverse in
G. As a result G is a group by Proposition 2.3.2. □

Corollary A semigroup G is a group iff aG = G and Ga = G for all a ∈ G, where
aG = {ax : x ∈ G} and Ga = {xa : x ∈ G}.
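
For a finite semigroup the criterion of this corollary is easy to test by enumeration. The sketch below assumes two examples of our own choosing, the non-zero residues modulo 5 and all residues modulo 6, both under multiplication (associativity is taken for granted, since both are semigroups), and also solves an equation ax = b by search.

```python
def is_group_by_translations(elems, mul):
    """Test aG = G and Ga = G for every a in G (mul assumed associative)."""
    G = set(elems)
    return all(
        {mul(a, x) for x in G} == G and {mul(x, a) for x in G} == G
        for a in G
    )

def solutions(a, b, elems, mul):
    """All x in G with ax = b."""
    return [x for x in elems if mul(a, x) == b]

def mul_mod_5(x, y):
    return (x * y) % 5

def mul_mod_6(x, y):
    return (x * y) % 6

units_mod_5 = [1, 2, 3, 4]       # non-zero residues modulo 5
all_mod_6 = list(range(6))       # all residues modulo 6
```

The first semigroup passes the test and 2x = 3 has the single solution x = 4, whereas the second fails because 0G = {0} ≠ G.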

Definition 2.3.2 The order of a group G is the cardinal number |G| and G is said
to be finite (infinite) according as |G| is finite (infinite).

Example 2.3.1 (Z, +), (Q, +) and (R, +) are infinite abelian groups, where +
denotes ordinary addition. They are called the additive group of integers, the
additive group of rational numbers and the additive group of real numbers, respectively.

Example 2.3.2 (Q*, ·), (R*, ·), (C*, ·) and (S¹, ·) (S¹ = {z ∈ C : |z| = 1}) form
groups under usual multiplication, where Q*, R*, C* and C denote respectively the
set of all non-zero rational numbers, non-zero real numbers, non-zero complex
numbers and all complex numbers. They are called the multiplicative group of non-zero
rationals, the multiplicative group of non-zero reals and the multiplicative group of
non-zero complex numbers, respectively. (S¹, ·) is called the circle group in C.

We now present vast sources of interesting groups. 
Example 2.3.3 (Permutation group) Let S be a non-empty set and A(S) be the set
of all bijective mappings from S onto itself. Let f, g ∈ A(S). Define

fg by (fg)(x) = f(g(x)), for all x ∈ S. (2.1)

First we show that fg ∈ A(S). Suppose x, y ∈ S and (fg)(x) = (fg)(y). Then
f(g(x)) = f(g(y)). Since f is injective, it follows that g(x) = g(y). Again this
implies x = y as g is injective. Consequently, fg is an injective mapping. Now
let x ∈ S. Since f is surjective, there exists y ∈ S such that f(y) = x. Again g is
surjective. Hence g(z) = y for some z ∈ S. Then (fg)(z) = f(g(z)) = f(y) = x.
This implies that fg is surjective and hence fg ∈ A(S). Composition of mappings
is associative and the identity mapping on S is an identity element of A(S). Since
every element of A(S) has an inverse, we find that A(S) is a group under the
composition defined by (2.1). This group is called the permutation group or the group
of all permutations on S.

Example 2.3.4 (Symmetric group) In Example 2.3.3, if S contains only n (n ≥ 1)
elements, say S = I_n = {1, 2, ..., n}, then the group A(S) is called the symmetric
group S_n on n elements. If f ∈ A(S), then f can be described by listing the elements
of I_n on a row and the image of each element under f directly below it. According
to this convention we write

\[ f = \begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix}. \]

Suppose now i₁, i₂, ..., i_r (r ≤ n) are r distinct elements of I_n. If f ∈ A(S) is
such that f maps i₁ ↦ i₂, i₂ ↦ i₃, ..., i_{r−1} ↦ i_r, i_r ↦ i₁ and maps every other
element of I_n onto itself, then f is also written as f = (i₁ i₂ ⋯ i_r). This is called
an r-cycle. A 2-cycle is called a transposition. The permutations α₁, α₂, ..., α_t in
S_n are said to be disjoint iff for every i, 1 ≤ i ≤ t, and every k in I_n, α_i(k) ≠ k
implies α_j(k) = k for every j ≠ i, 1 ≤ j ≤ t. A permutation α ∈ S_n is said to be even
or odd according as α can be expressed as a product of an even or odd number of
transpositions. The set of all even permutations in S_n forms a group, called the
alternating group of degree n, denoted by A_n. For n ≥ 3, it is a normal subgroup
of S_n (see Ex. 14 of SE-II).

The groups S_n are very useful in the study of finite groups. As n increases, the
structure of S_n becomes complicated. But we can work out the case n = 3
comfortably. The symmetric group S₃ has six elements and is the smallest group whose
law of composition is not commutative.

Because of the importance of this group, we now describe it as follows:

For n = 3, I₃ = {1, 2, 3} and A(S) = S₃. The symmetric group S₃ consists of
exactly six elements:

e  : 1 ↦ 1, 2 ↦ 2, 3 ↦ 3;
f₁ : 1 ↦ 2, 2 ↦ 3, 3 ↦ 1;
f₂ : 1 ↦ 3, 2 ↦ 1, 3 ↦ 2;
f₃ : 1 ↦ 1, 2 ↦ 3, 3 ↦ 2;
f₄ : 1 ↦ 3, 2 ↦ 2, 3 ↦ 1;
f₅ : 1 ↦ 2, 2 ↦ 1, 3 ↦ 3.

We can describe these six mappings in the following way:

\[ e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad
f_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = (1\ 2\ 3), \quad
f_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = (1\ 3\ 2), \]

\[ f_3 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} = (2\ 3), \quad
f_4 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = (1\ 3), \quad
f_5 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = (1\ 2). \]

Here e is the identity element.

Now

\[ f_1 f_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = f_5 \quad \text{and} \quad
f_3 f_1 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = f_4. \]

Hence f₁f₃ ≠ f₃f₁. Consequently S₃ is not a commutative group.
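
The two products can be recomputed mechanically. The sketch below stores each permutation of {1, 2, 3} as a dictionary and composes them by the rule (fg)(x) = f(g(x)) of (2.1); the dictionary representation and the name compose are our own choices.

```python
# The six elements of S_3 as mappings on {1, 2, 3}.
e  = {1: 1, 2: 2, 3: 3}
f1 = {1: 2, 2: 3, 3: 1}   # the 3-cycle (1 2 3)
f2 = {1: 3, 2: 1, 3: 2}   # the 3-cycle (1 3 2)
f3 = {1: 1, 2: 3, 3: 2}   # the transposition (2 3)
f4 = {1: 3, 2: 2, 3: 1}   # the transposition (1 3)
f5 = {1: 2, 2: 1, 3: 3}   # the transposition (1 2)

S3 = [e, f1, f2, f3, f4, f5]

def compose(f, g):
    """(fg)(x) = f(g(x)), i.e. g is applied first."""
    return {x: f[g[x]] for x in g}

f1_f3 = compose(f1, f3)
f3_f1 = compose(f3, f1)

# Closure: every product of two elements is again one of the six mappings.
closed = all(compose(f, g) in S3 for f in S3 for g in S3)
```

With this convention f1_f3 equals f5 and f3_f1 equals f4, so the two products differ.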


Example 2.3.5 (General linear group) Let GL(2, R) denote the set of all 2 × 2
matrices \begin{pmatrix} a & b \\ c & d \end{pmatrix}, where a, b, c, d are real numbers and ad − bc ≠ 0. Taking usual
multiplication of matrices as the group operation, we can show that GL(2, R) is a
group, where \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} is the identity element and the element

\[ \begin{pmatrix} \dfrac{d}{ad - bc} & \dfrac{-b}{ad - bc} \\[2mm] \dfrac{-c}{ad - bc} & \dfrac{a}{ad - bc} \end{pmatrix} \]

is the inverse of \begin{pmatrix} a & b \\ c & d \end{pmatrix} in GL(2, R).

This is a non-commutative group, since there exist matrices A, B ∈ GL(2, R)
with AB ≠ BA.

This group is called the general linear group of order 2 over the set of all real 
numbers R. 
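
The inverse formula and the failure of commutativity can both be checked with exact rational arithmetic. In the sketch below the matrices A and B are our own illustrative choice (they are not taken from the text), and matrices are stored as pairs of rows.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of 2 x 2 matrices stored as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse(A):
    """Inverse of ((a, b), (c, d)) by the formula above; needs ad - bc != 0."""
    delta = Fraction(det(A))
    if delta == 0:
        raise ValueError("matrix is not in GL(2, R)")
    (a, b), (c, d) = A
    return ((d / delta, -b / delta), (-c / delta, a / delta))

IDENTITY = ((1, 0), (0, 1))

# Hypothetical sample elements of GL(2, R), both of determinant 1.
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

AB = matmul(A, B)
BA = matmul(B, A)
```

Here AB and BA differ, while matmul(A, inverse(A)) and matmul(inverse(A), A) both return the identity matrix.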


Remark The group GL(2, C) defined in a similar way is called the general linear group
of order 2 over C. In general, GL(n, R) (GL(n, C)), the group of all invertible n × n
real (complex) matrices, is called the general linear group of order n over R (C).

The concept of congruence of integers (see Chap. 1) is essentially due to Gauss. 
This is one of the most important concepts in number theory. This suggests the 
concept of congruence relations on groups, which is important in modern algebra, 
because every congruence relation on a group produces a new group. 
We now introduce the concept of congruence relation on a group. An equivalence
relation ρ on a group G is said to be a congruence relation on G iff (a, b) ∈ ρ
implies (ca, cb) ∈ ρ and (ac, bc) ∈ ρ, for all c ∈ G. Let a be an element of a group
G and ρ be a congruence relation on G. Then the subset {x ∈ G : (a, x) ∈ ρ} is
called a congruence class for the element a. This subset is denoted by (a). The
following theorem lists some properties of congruence classes.

Theorem 2.3.1 Let ρ be a congruence relation on a group G. Then

(i) a ∈ (a);

(ii) (a) = (b) iff (a, b) ∈ ρ;

(iii) if a, b ∈ G, then either (a) = (b) or (a) ∩ (b) = ∅;

(iv) if (a) = (b) and (c) = (d), then (ac) = (bd) and (ca) = (db).

Proof (i) Left as an exercise.

(ii) Left as an exercise.

(iii) Suppose (a) ∩ (b) ≠ ∅. Let c ∈ (a) ∩ (b). Then (a, c) ∈ ρ and (b, c) ∈ ρ. Let
x ∈ (a). Then (a, x) ∈ ρ. From the symmetric and transitive properties of ρ and from
(a, c) ∈ ρ and (a, x) ∈ ρ we find that (c, x) ∈ ρ. Hence (b, c) ∈ ρ and (c, x) ∈ ρ
imply that (b, x) ∈ ρ. As a result x ∈ (b) and hence (a) ⊆ (b). Similarly, we can
show that (b) ⊆ (a). Consequently (a) = (b).

(iv) From the assumption, (a, b) ∈ ρ and (c, d) ∈ ρ. Now ρ is a congruence
relation. Hence (ca, cb) ∈ ρ and (cb, db) ∈ ρ. The transitive property of ρ implies
that (ca, db) ∈ ρ. Hence (ca) = (db). Similarly (ac) = (bd). □

The following theorem will show how one can construct a new group from a 
given group G if a congruence relation p on G is given. 

Theorem 2.3.2 Let $\rho$ be a congruence relation on a group $G$. If $G/\rho$ is the set
of all congruence classes for $\rho$ on $G$, then $G/\rho$ becomes a group under the binary
operation given by $(a)(b) = (ab)$.

Proof Let $(a), (b) \in G/\rho$. Suppose $(a) = (c)$ and $(b) = (d)$ ($c, d \in G$). Then
$(a, c) \in \rho$ and $(b, d) \in \rho$. Hence from (iv) of Theorem 2.3.1, we find that $(ab, cd) \in \rho$.
This shows that $(ab) = (cd)$. Therefore the operation defined by $(a)(b) = (ab)$ is
well defined. Now, let $(a)$, $(b)$ and $(c)$ be three elements of $G/\rho$. Then $((a)(b))(c) =
(ab)(c) = ((ab)c) = (a(bc)) = (a)(bc) = (a)((b)(c))$. Therefore $G/\rho$ is a semigroup.
We can show that the congruence class $(1)$ containing the identity element 1
of $G$ is the identity element of $G/\rho$. If $(a) \in G/\rho$, then $a \in G$ and hence $a^{-1} \in G$
implies $(a^{-1}) \in G/\rho$. Now $(a^{-1})(a) = (a^{-1}a) = (1) = (aa^{-1}) = (a)(a^{-1})$. Hence
the inverse of $(a)$ exists in $G/\rho$. Consequently $G/\rho$ is a group. □

Corollary 1 If $G$ is a commutative group and $\rho$ is a congruence on $G$, then $G/\rho$
is a commutative group.

Example 2.3.6 (Additive group of integers modulo m) Let $(\mathbf{Z}, +)$ be the additive
group of integers. Let $m$ be a fixed positive integer. Define a relation $\rho$ on $\mathbf{Z}$ by

$$\rho = \{(a, b) \in \mathbf{Z} \times \mathbf{Z} : a - b \text{ is divisible by } m\}.$$

Clearly, $\rho$ is a congruence relation. This relation is usually called the congruence
relation modulo $m$. Two integers $a$ and $b$ are said to be congruent modulo $m$, written
$a \equiv b \pmod{m}$, iff $a - b$ is divisible by $m$. For each integer $a$, the congruence class
to which $a$ belongs is $(a) = \{x \in \mathbf{Z} : x \equiv a \pmod{m}\} = \{a + km : k \in \mathbf{Z}\}$. Let $\mathbf{Z}/\rho$
denote the set of all congruence classes modulo $m$.

From Theorem 2.3.2, we find that $\mathbf{Z}/\rho$ is a group where the group operation + is
defined by

$$(a) + (b) = (a + b).$$

This group is a commutative group. In this group the identity element is the
congruence class $(0)$ and $(-a)$ is the inverse of $(a)$. Generally we denote this group
by $\mathbf{Z}_m$ and this group is said to be the additive group of integers modulo $m$.
The classes $(0), (1), \ldots, (m - 1)$ exhaust all the elements of $\mathbf{Z}_m$. Given an arbitrary
integer $a$, the division algorithm implies that there exist unique integers $q$ and $r$ such that

$$a = mq + r, \quad 0 \le r < m.$$

Hence $a \equiv r \pmod{m}$, and from the definition of congruence it follows that $(a) = (r)$.
Clearly, there are $m$ possible remainders: $0, 1, \ldots, m - 1$. Consequently, every
integer belongs to one and only one of the $m$ different congruence classes
$(0), (1), \ldots, (m - 1)$, and $\mathbf{Z}_m$ consists of exactly these elements.

Remark The above example shows that for every positive integer $m$, there exists an
abelian group $G$ such that $|G| = m$.

Note 1 $\mathbf{Z}_m$ may be considered to be a group consisting of the $m$ integers $0, 1, \ldots, m - 1$
together with the binary operation $*$ given by the rule: for $t, s \in \{0, 1, \ldots, m - 1\}$,

$$t * s = \begin{cases} t + s, & \text{if } t + s < m, \\ r, & \text{if } t + s \ge m, \text{ where } r \text{ is the remainder when } t + s \text{ is divided by } m. \end{cases}$$
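
The rule of Note 1 and the well-definedness of $(a) + (b) = (a + b)$ can be illustrated computationally. The sketch below is a small Python illustration for $m = 6$; the function names `add_mod` and `residue` and the particular shifts tested are assumptions made here for the example.

```python
def add_mod(t, s, m):
    """The operation * of Note 1 on {0, 1, ..., m-1}."""
    total = t + s
    return total if total < m else total % m

def residue(a, m):
    """The representative r in {0, ..., m-1} with (a) = (r)."""
    return a % m  # for m > 0, Python's % always returns a value in [0, m)

m = 6
elements = range(m)

# Cayley table of Z_6 under *.
table = [[add_mod(t, s, m) for s in elements] for t in elements]

# (a) + (b) = (a + b) does not depend on the chosen representatives:
# shifting a and b by multiples of m leaves the class of the sum unchanged.
well_defined = all(
    residue((a + k * m) + (b + l * m), m) == add_mod(residue(a, m), residue(b, m), m)
    for a in elements for b in elements for k in (-2, 0, 3) for l in (-1, 0, 5)
)

for row in table:
    print(row)
print(well_defined)
```

Each row of the printed table is a rearrangement of $0, 1, \ldots, 5$, as expected of the Cayley table of a group.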

One of the basic problems in group theory is to classify groups up to isomorphism.
For such a classification, we either have to construct an explicit isomorphism
or have to show that no such isomorphism exists. An isomorphism
is a special kind of homomorphism, and it is used to identify groups for the purpose of
classification. The concept of a homomorphism is itself very important in modern algebra.

In the context of group theory (as for semigroups), the word homomorphism means
a mapping from one group to another which respects the binary operations defined on the
two groups: if $f : G \to G'$ is a mapping between groups $G$ and $G'$, then $f$ is called
a homomorphism iff $f(ab) = f(a)f(b)$ for all $a, b \in G$. The kernel of the homomorphism
$f$, denoted by $\ker f$, is defined by $\ker f = \{a \in G : f(a) = 1_{G'} = 1'\}$.

Remark Taking $a = b = 1$, the identity element in $G$, we find $f(1) = f(1)f(1)$,
which shows that $f(1)$ is the identity element $1'$ of $G'$. Again taking $b = a^{-1}$, we
find $f(1) = f(a)f(a^{-1})$. This implies $f(a^{-1}) = [f(a)]^{-1}$. Thus a homomorphism
of groups maps the identity element to the identity element and the inverse of an element to
the inverse of its image.

Note 2 If $f : G \to G'$ is a homomorphism of groups, then

$$f(a^n) = (f(a))^n, \quad a \in G,\ n \in \mathbf{Z} \quad (a^n \text{ is defined in Definition 2.3.4}).$$

Proof It follows by induction on $n$ that for $n = 1, 2, 3, \ldots$,

$$f(a^n) = (f(a))^n.$$

Again $f(a^{-1}) = (f(a))^{-1}$ shows that for $n = -k$, $k = 1, 2, 3, \ldots$,

$$f(a^n) = f((a^{-1})^k) = (f(a^{-1}))^k = (f(a))^{-k} = (f(a))^n.$$

Finally, $f(a^0) = f(1) = 1' = (f(a))^0$. □

Definition 2.3.3 A homomorphism $f : G \to G'$ between groups is called

(i) an epimorphism iff $f$ is surjective (i.e., onto);

(ii) a monomorphism iff $f$ is injective (i.e., 1-1);

(iii) an isomorphism iff $f$ is an epimorphism and a monomorphism.

An isomorphism of a group onto itself is called an automorphism and a homomorphism
of a group $G$ into itself is called an endomorphism.

Remark Since two isomorphic groups have identical algebraic properties, it is convenient
to identify them with each other. They may be considered as replicas of each other.

If $f : G \to H$ and $g : H \to K$ are homomorphisms of groups, then their usual
composition $g \circ f : G \to K$ defined by $(g \circ f)(a) = g(f(a))$, $a \in G$, is again a
homomorphism, because if $a, b \in G$, then $(g \circ f)(ab) = g(f(ab)) = g(f(a)f(b)) =
g(f(a))g(f(b)) = (g \circ f)(a)(g \circ f)(b)$.

Theorem 2.3.3 Let $f : G \to G'$ be a homomorphism of groups. Then

(i) $f$ is a monomorphism iff $\ker f = \{1\}$;

(ii) $f$ is an epimorphism iff $\operatorname{Im} f = G'$;

(iii) $f$ is an isomorphism iff there is a homomorphism $g : G' \to G$ such that $g \circ f =
I_G$ and $f \circ g = I_{G'}$, where $I_G$ is the identity homomorphism of $G$.

Proof (i) Let $f$ be a monomorphism. Then $f$ is injective and hence $\ker f = \{1\}$.
Conversely, let $\ker f = \{1\}$. Now if $f(a) = f(b)$, then $f(ab^{-1}) = f(a)f(b)^{-1} = 1'$
shows that $ab^{-1} \in \ker f = \{1\}$, and hence $a = b$. Consequently, $f$ is injective.

(ii) and (iii) follow trivially. □

Theorem 2.3.4 Let $G$ be a group and Aut $G$ be the set of all automorphisms of $G$.
Then Aut $G$ is a group, called the automorphism group of $G$.

Proof For $f, g \in$ Aut $G$, define $f \circ g$ to be the usual composition. Clearly, $f \circ g \in$
Aut $G$. Then Aut $G$ becomes a group with the identity homomorphism $I_G$ of $G$ as
identity and $f^{-1}$ as the inverse of $f \in$ Aut $G$. □

Example 2.3.7 (i) Let $(\mathbf{R}^*, \cdot)$ be the multiplicative group of non-zero reals and
GL(2, R) be the general linear group defined in Example 2.3.5. Consider the map
$f :$ GL(2, R) $\to \mathbf{R}^*$ defined by $f(A) = \det A$. Then for $A, B \in$ GL(2, R), $AB \in$
GL(2, R) and $f(AB) = \det(AB) = \det A \det B = f(A)f(B)$. This shows that $f$
is a homomorphism. As $1 \in \mathbf{R}^*$ is the identity element of $(\mathbf{R}^*, \cdot)$, $\ker f = \{A \in$
GL(2, R) $: f(A) = \det A = 1\}$.

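(ii) Let $G$ be the group under the binary operation on $\mathbf{R}^3$ defined by $(a, b, c) *
(x, y, z) = (a + x, b + y, c + z + ay)$ and $H$ be the group of matrices

$$\left\{ \begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix} : a, b, c \in \mathbf{R} \right\}$$

under usual matrix multiplication. Then the map $f : G \to H$ defined by

$$f(a, b, c) = \begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix}$$

is an isomorphism.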

(iii) Let $G$ be an abelian group and $f : G \to G$ be the map defined by $f(a) =
a^{-1}$. Then $f$ is an automorphism of $G$. This automorphism is different from the
identity automorphism whenever some element of $G$ is not its own inverse.

(iv) Let $(\mathbf{R}, +)$ and $(\mathbf{R}^+, \cdot)$ denote the additive group of reals and the multiplicative
group of positive reals, respectively. Then the map $f : \mathbf{R} \to \mathbf{R}^+$ defined by $f(x) =
e^x$ is an isomorphism.

(v) Let $(\mathbf{Q}, +)$ and $(\mathbf{Q}^+, \cdot)$ denote the additive group of rationals and the multiplicative
group of positive rationals, respectively. Then these groups cannot be isomorphic.
To prove this, suppose $\exists$ an isomorphism $f : (\mathbf{Q}, +) \to (\mathbf{Q}^+, \cdot)$. Then for $2 \in \mathbf{Q}^+$,
$\exists$ a unique element $x \in \mathbf{Q}$ such that $f(x) = 2$. Now $x = \frac{x}{2} + \frac{x}{2}$ and $\frac{x}{2} \in \mathbf{Q}$ show
that $2 = f(x) = f(\frac{x}{2} + \frac{x}{2}) = f(\frac{x}{2})f(\frac{x}{2}) = [f(\frac{x}{2})]^2$. This cannot hold as $f(\frac{x}{2})$ is a
positive rational, while no rational number has square 2. Hence $f$ cannot be an isomorphism.
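
The homomorphism property in part (ii) of Example 2.3.7 reduces to a single matrix multiplication, which can be checked mechanically. The sketch below tests $f(u * v) = f(u)f(v)$ on a grid of integer triples only (a finite sample, not a proof); the names `star`, `f`, `mat_mul3` and `samples` are chosen here for illustration.

```python
from itertools import product

def star(u, v):
    """(a, b, c) * (x, y, z) = (a + x, b + y, c + z + a*y)."""
    a, b, c = u
    x, y, z = v
    return (a + x, b + y, c + z + a * y)

def f(u):
    """Map (a, b, c) to the matrix with rows (1, a, c), (0, 1, b), (0, 0, 1)."""
    a, b, c = u
    return ((1, a, c), (0, 1, b), (0, 0, 1))

def mat_mul3(A, B):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

# Integer sample points only; the identity itself holds for all real triples.
samples = list(product(range(-2, 3), repeat=3))
is_hom = all(f(star(u, v)) == mat_mul3(f(u), f(v)) for u in samples for v in samples)
print(is_hom)

# The operation * is not commutative:
print(star((1, 0, 0), (0, 1, 0)), star((0, 1, 0), (1, 0, 0)))
```

Since $f$ simply places the three coordinates in fixed matrix positions, it is visibly bijective, so only the homomorphism property needs a computation.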


In any group, the integral powers $a^n$ of a group element $a$ play an important role
in the study of finitely generated groups, in particular, cyclic groups.

Powers of an element have been defined in a semigroup for positive integral
indices. In a group, however, powers can be defined for all integral values of the
index, i.e., positive, negative, and zero.

Definition 2.3.4 Let $a$ be an element of a group $G$. Define

(i) $a^0 = 1$ and $a^1 = a$;

(ii) $a^{n+1} = a^n a$ for any non-negative integer $n$.

If $n = -m$ ($m > 0$) is a negative integer, then define $a^{-m} = (a^m)^{-1} = (a^{-1})^m$;
$a^n$ is said to be the $n$th power of $a$.

The exponents so defined have the following properties:

$$a^m a^n = a^{m+n} \quad \text{and} \quad (a^m)^n = a^{mn}.$$

Definition 2.3.5 Let $G$ be a group. An element $a \in G$ is said to be of finite order iff
there exists a positive integer $n$ such that $a^n = 1$. If $a$ is an element of finite order,
then the smallest positive integer of the set $\{m \in \mathbf{N}^+ : a^m = 1\}$ (existence of the
smallest positive integer is guaranteed by the well-ordering principle) is called the
order or period of $a$ (denoted by $O(a)$). An element $a \in G$ is said to be of infinite
order iff there is no positive integer $n$ such that $a^n = 1$.

Example 2.3.8 Let $G = \{1, -1, i, -i\}$. Then $G$ is a group under usual multiplication
of complex numbers. In this group, $-1$ is an element of order 2 and $i$ is an
element of order 4.

Remark An element $a$ of a group $G$ is of infinite order iff $a^r \neq a^t$ whenever $r \neq t$.
This is so because $a^r = a^t \Leftrightarrow a^{r-t} = 1$; that is, $a$ is of finite order, unless $r = t$.

Theorem 2.3.5 Let $G$ be a group and $a \in G$ such that $O(a) = t$. Then

(i) the elements $1 = a^0, a, \ldots, a^{t-1}$ are all distinct;

(ii) $a^n = 1$ iff $t \mid n$;

(iii) $a^n = a^m$ iff $n \equiv m \pmod{t}$;

(iv) $O(a^r) = t/\gcd(t, r)$, ($r, m, n \in \mathbf{Z}$).

Proof (i) If for $0 \le n < m < t$, $a^m = a^n$, then $a^{m-n} = a^m(a^n)^{-1} = 1$. This contradicts
the property of the order of $a$, since $0 < m - n < t$ and $O(a) = t$.

(ii) If $n = tm$, then $a^n = a^{tm} = (a^t)^m = 1$. Conversely, if $a^n = 1$, taking $n =
tm + r$, where $0 \le r < t$, we have $1 = a^n = a^{tm+r} = (a^t)^m a^r = a^r$. This implies $r =
0$, since $t$ is the smallest positive integer for which $a^t = 1$ and $0 \le r < t$. Therefore
$n = tm$, and hence $t$ is a factor of $n$.

(iii) $a^m = a^n \Leftrightarrow a^{n-m} = 1 \Leftrightarrow t$ is a factor of $n - m$ (by (ii)) $\Leftrightarrow n \equiv m \pmod{t}$.

(iv) Let $\gcd(t, r) = m$ and $t/\gcd(t, r) = n$. Then $t = nm$.
Let $r = ms$, where $\gcd(n, s) = 1$. Now $(a^r)^q = a^{rq} = 1 \Leftrightarrow t \mid rq$ (by (ii)) $\Leftrightarrow
nm \mid msq \Leftrightarrow n \mid sq \Leftrightarrow n \mid q$, since $\gcd(n, s) = 1$. Thus the smallest positive integer $q$
for which $(a^r)^q = 1$ holds is given by $q = n$. This implies $n = t/\gcd(t, r)$ is the
order of $a^r$. □

Remark Any power $a^n$ is equal to one of the elements given in (i).
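
Part (iv) of Theorem 2.3.5 can be checked by brute force in the additive group $\mathbf{Z}_t$, where $a = (1)$ has order $t$ and the power $a^r$ is the class $(r)$. The following Python sketch does this for $t = 12$; the function name `additive_order` and the choice $t = 12$ are illustrative assumptions.

```python
from math import gcd

def additive_order(x, t):
    """Order of the class (x) in the additive group Z_t, found by direct search."""
    x %= t
    n = 1
    while (n * x) % t != 0:
        n += 1
    return n

t = 12
# In additive notation the r-th "power" of a = (1) is the class (r).
orders = {r: additive_order(r, t) for r in range(t)}
formula = {r: t // gcd(t, r) for r in range(t)}

print(orders)
print(orders == formula)
```

The search and the closed formula agree for every $r$, including $r = 0$, where $\gcd(t, 0) = t$ gives order 1.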

Definition 2.3.6 A group G is called a 

(i) torsion group or a periodic group iff every element of G is of finite order; 

(ii) torsion-free group iff every element of G except the identity element is of infi- 
nite order; 

(iii) mixed group iff some elements of $G$ are of infinite order and some elements other
than 1 are of finite order.

Example 2.3.9 (i) (Finite torsion group) Consider the multiplicative group $G =
\{1, -1, i, -i\}$. Then $G$ is a torsion group.

(ii) (Group of rationals modulo 1) Consider the additive group $\mathbf{Q}$ of all rational
numbers. Let $\rho = \{(a, b) \in \mathbf{Q} \times \mathbf{Q} : a - b \in \mathbf{Z}\}$. Then $\rho$ is a congruence relation on $\mathbf{Q}$
and hence $\mathbf{Q}/\rho$ is a group under the composition $(a) + (b) = (a + b)$. Now $(\frac{1}{n})$, $n =
1, 2, 3, \ldots$, are distinct elements of $\mathbf{Q}/\rho$. Hence $\mathbf{Q}/\rho$ contains an infinite number of
elements. Let $p/q$ be any rational number such that $q > 0$. Now $q(p/q) = (p) = (0)$
shows that $(p/q)$ is an element of finite order. Consequently, $\mathbf{Q}/\rho$ is a torsion group.
This is an infinite abelian group, called the group of rationals modulo 1, denoted by
$\mathbf{Q}/\mathbf{Z}$.

Example 2.3.10 (Torsion-free group) The positive rational numbers form a multiplicative
group $\mathbf{Q}^+$ under usual multiplication. The integer 1 is the identity element
of this group and it is the only element of finite order. Consequently, $\mathbf{Q}^+$ is a
torsion-free group.

Example 2.3.11 (Mixed group) The multiplicative group of non-zero complex
numbers $\mathbf{C}^*$ contains infinitely many elements of finite order, viz. every $n$th root
of unity, for $n = 1, 2, \ldots$. It also contains infinitely many elements $re^{i\theta}$, with $r > 0$, $r \neq 1$,
of infinite order. Consequently, $\mathbf{C}^*$ is a mixed group.
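
For the group $\mathbf{Q}/\mathbf{Z}$ of Example 2.3.9(ii), the order of a class $(x)$ is the least $n \ge 1$ for which $nx$ is an integer. The sketch below computes this by direct search with Python's `fractions.Fraction` and compares it with the denominator of $x$ in lowest terms; the function name `order_mod_1` and the sample range are assumptions made for this illustration.

```python
from fractions import Fraction

def order_mod_1(x):
    """Order of the class (x) in Q/Z: the least n >= 1 with n*x an integer."""
    x = Fraction(x)
    n = 1
    while (n * x).denominator != 1:
        n += 1
    return n

# A finite sample of rationals p/q; Fraction reduces each to lowest terms.
samples = [Fraction(p, q) for q in range(1, 13) for p in range(-5, 6)]
orders = {x: order_mod_1(x) for x in samples}

print(order_mod_1(Fraction(3, 4)))
print(all(orders[x] == x.denominator for x in samples))
```

Every class has finite order, in line with the statement that $\mathbf{Q}/\mathbf{Z}$ is a torsion group, while the orders themselves are unbounded.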


2.4 Subgroups and Cyclic Groups 

Arbitrary subsets of a group do not generally invite any attention. But subsets form- 
ing groups contained in larger groups create interest. For example, the group of even 
integers with 0, under usual addition is contained in the larger group of all integers 
and the group of positive rational numbers under usual multiplication is contained 
in the larger group of positive real numbers. Such examples suggest the concept of a 
subgroup, which is very important in the study of group theory. The cyclic subgroup 
is an important subgroup and is generated by an element g of a group G. It is the 
smallest subgroup of G which contains g. In this section we study subgroups and
cyclic groups.

Let $(G, \circ)$ be a group and $H$ a non-empty subset of $G$. If for any two elements
$a, b \in H$, it is true that $a \circ b \in H$, then we say that $H$ is closed under the group
operation of $G$. Suppose now $H$ is closed under the group operation '$\circ$' on $G$.
Then we can define a binary operation $\circ_H : H \times H \to H$ by $a \circ_H b = a \circ b$ for all
$a, b \in H$. This operation $\circ_H$ is said to be the restriction of '$\circ$' to $H \times H$. We explain
this with the help of the following example.

Example 2.4.1 Consider the additive group $(\mathbf{R}, +)$ of real numbers. If $\mathbf{Q}^*$ is the
set of all non-zero rational numbers, then $\mathbf{Q}^*$ is a non-empty subset of $\mathbf{R}$. Now for
any two $a, b \in \mathbf{Q}^*$, we find that $ab$ (under the usual product of rational numbers) is an
element of $\mathbf{Q}^*$. But we cannot say that $\mathbf{Q}^*$ is closed under the group operation '+'
on $\mathbf{R}$; for instance, $1 + (-1) = 0 \notin \mathbf{Q}^*$. Consider now the set $\mathbf{Z}$ of integers. We can
show easily that $\mathbf{Z}$ is closed under the group operation '+' on $\mathbf{R}$.

Definition 2.4.1 A subset $H$ of a group $(G, \circ)$ is said to be a subgroup of the group
$G$ iff

(i) $H \neq \emptyset$;

(ii) $H$ is closed under the group operation '$\circ$' on $G$ and

(iii) $(H, \circ_H)$ is itself a group, where $\circ_H$ is the restriction of '$\circ$' to $H \times H$.

Obviously, every subgroup is automatically a group.

Remark $(\mathbf{Q}^*, \cdot)$ is not a subgroup of the additive group $(\mathbf{R}, +)$. But $(\mathbf{Z}, +)$ is a
subgroup of the group $(\mathbf{R}, +)$.

The following theorem makes it easier to verify that a particular subset H of 
a group G is actually a subgroup of G. In stating the theorem we again use the 
multiplicative notation. 

Theorem 2.4.1 Let $G$ be a group and $H$ a subset of $G$. Then $H$ is a subgroup of
$G$ iff $H \neq \emptyset$ and $ab^{-1} \in H$ for all $a, b \in H$.

Proof Let $H$ be a subgroup of $G$ and $a, b \in H$. Then $b^{-1} \in H$ and hence $ab^{-1} \in H$.
Conversely, let the given conditions hold in $H$. Then $aa^{-1} = 1 \in H$ and $1a^{-1} =
a^{-1} \in H$ for all $a \in H$. Clearly, the associative property holds in $H$ (as it is hereditary).
Finally, for all $a, b \in H$, $ab = a(b^{-1})^{-1} \in H$, since $b^{-1} \in H$. Consequently, $H$ is a
subgroup of $G$. □

Corollary A subset $H$ of a group $G$ is a subgroup of $G$ iff $H \neq \emptyset$ and $ab \in H$,
$a^{-1} \in H$ for all $a, b \in H$.

Proof It follows from Theorem 2.4.1. □

Using the above theorem we can prove the following:

(i) $G$ is a subgroup of the group $G$.

(ii) $H = \{1\}$ is a subgroup of the group $G$.

Hence we find that every group $G$ has at least two subgroups: $G$ and $\{1\}$. These
are called trivial subgroups. Other subgroups of $G$ (if they exist) are called proper
subgroups of $G$.

We now give an interesting example of a subgroup.

Definition 2.4.2 The center of a group $G$, written $Z(G)$, is the set of those elements
of $G$ which commute with every element in $G$, that is, $Z(G) = \{a \in G : ab =
ba \text{ for all } b \in G\}$.

This set is extremely important in group theory. In an abelian group $G$, $Z(G) =
G$. The center is in fact a subgroup of $G$. This follows from the following theorem.

Theorem 2.4.2 The center $Z(G)$ of a group $G$ is a subgroup of $G$.

Proof Since $1b = b = b1$ for all $b \in G$, $1 \in Z(G)$. Hence $Z(G) \neq \emptyset$. Let $a, b \in
Z(G)$. Then for all $x \in G$, $ax = xa$ and $bx = xb$. Hence $ax = xa$ and $xb^{-1} =
b^{-1}x$. Now $(ab^{-1})x = a(b^{-1}x) = a(xb^{-1}) = (ax)b^{-1} = (xa)b^{-1} = x(ab^{-1})$ for
all $x \in G$. Hence $ab^{-1} \in Z(G)$. Consequently $Z(G)$ is a subgroup of $G$ by Theorem 2.4.1. □
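
The definition of the center translates directly into a finite search. As a small illustration, the Python sketch below computes the center of the symmetric group on three letters and of the abelian group $\mathbf{Z}_6$; permutations are represented as tuples on $\{0, 1, 2\}$, and the names `compose` and `center` are assumptions introduced for this example.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]]: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def center(G, op):
    """Elements of the finite group G that commute with every element of G."""
    return [a for a in G if all(op(a, b) == op(b, a) for b in G)]

S3 = list(permutations(range(3)))   # the six permutations of {0, 1, 2}
Z6 = list(range(6))

center_S3 = center(S3, compose)
center_Z6 = center(Z6, lambda a, b: (a + b) % 6)

print(center_S3)
print(center_Z6)
```

Only the identity permutation commutes with everything in the symmetric group on three letters, whereas the abelian group $\mathbf{Z}_6$ equals its own center.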

We now introduce the concepts of conjugacy class and conjugate subgroup.

Let $G$ be a group and $a \in G$. An element $b \in G$ is said to be a conjugate of $a$
in $G$ iff $\exists\, g \in G$ such that $b = gag^{-1}$. Then the relation $\rho$ on $G$ defined by $\rho =
\{(a, b) \in G \times G : b \text{ is a conjugate of } a\}$ is an equivalence relation, called conjugacy
on $G$; the equivalence class $(a)$ of the relation $\rho$ is called the conjugacy class of $a$ in
$G$. Two subgroups $H$ and $K$ of $G$ are said to be conjugate subgroups iff there exists
some $g$ in $G$ such that $K = gHg^{-1}$. Clearly, conjugacy is an equivalence relation
on the collection of subgroups of $G$. The equivalence class of the subgroup $H$ is
called the conjugacy class or the conjugate class of $H$.

Theorem 2.4.3 Let $H$ be a subgroup of a group $G$ and $g \in G$. Then $gHg^{-1}$ is a
subgroup of $G$ such that $H \cong gHg^{-1}$.

Proof Since $1 = g1g^{-1} \in gHg^{-1}$, $gHg^{-1} \neq \emptyset$. For $gh_1g^{-1}, gh_2g^{-1} \in gHg^{-1}$,
we have $(gh_1g^{-1})(gh_2g^{-1})^{-1} = gh_1g^{-1}gh_2^{-1}g^{-1} = gh_1h_2^{-1}g^{-1} \in gHg^{-1}$. Hence
$gHg^{-1}$ is a subgroup of $G$. Consider the map $f : H \to gHg^{-1}$ defined by $f(h) =
ghg^{-1}$ $\forall h \in H$. For $h_1, h_2 \in H$, $h_1 = h_2 \Rightarrow gh_1g^{-1} = gh_2g^{-1} \Rightarrow f$ is well defined.
Also for $a \in gHg^{-1}$, there exists $h = g^{-1}ag \in H$ such that $f(h) = ghg^{-1} =
gg^{-1}agg^{-1} = a$. Moreover, $f(h_1) = f(h_2) \Rightarrow gh_1g^{-1} = gh_2g^{-1} \Rightarrow h_1 = h_2$.
Finally, $f(h_1h_2) = g(h_1h_2)g^{-1} = (gh_1g^{-1})(gh_2g^{-1}) = f(h_1)f(h_2)$ $\forall h_1, h_2 \in H$.
Consequently, $f$ is an isomorphism. □

Let $H$ and $K$ be two subgroups of a group $G$. Now both subgroups $H$ and $K$
contain the identity 1 of $G$. Hence $H \cap K \neq \emptyset$. Let $a, b \in H \cap K$. Then $a, b \in H$
and also $a, b \in K$. Since $H$ and $K$ are both subgroups, $ab^{-1} \in H$ and $ab^{-1} \in K$.
Hence $ab^{-1} \in H \cap K$ for all $a, b \in H \cap K$. Consequently, $H \cap K$ is a subgroup
of $G$. The more general theorem follows:

Theorem 2.4.4 Let $\{H_i : i \in I\}$ be a family of subgroups of a group $G$. Then
$\bigcap_{i \in I} H_i$ is a subgroup of $G$.

Proof Trivial. □

Theorem 2.4.5 Let $f : G \to K$ be a homomorphism of groups. Then

(i) $\operatorname{Im} f = \{f(a) : a \in G\}$ is a subgroup of $K$;

(ii) $\ker f = \{a \in G : f(a) = 1_K\}$ is a subgroup of $G$, where $1_K$ denotes the identity
element of $K$.

Proof It follows by using Theorem 2.4.1. □


Remark Let $\mathcal{C}$ be a given collection of groups. Define a relation $\sim$ on $\mathcal{C}$ by $G_1 \sim
G_2$ iff $\exists$ an isomorphism $f : G_1 \to G_2$. Then $\sim$ is an equivalence relation which
partitions $\mathcal{C}$ into mutually disjoint classes of isomorphic groups. Two isomorphic
groups are abstractly indistinguishable. The main problem of group theory is: given
a group $G$, how is one to determine the above equivalence class containing $G$, i.e., to
determine the class of all groups which are isomorphic to $G$? In other words, given
two groups $G_1$ and $G_2$, the problem is to determine whether $G_1$ is isomorphic to $G_2$
or not, i.e., either to determine an isomorphism $f : G_1 \to G_2$ or to show that there
does not exist such an isomorphism. We can solve such problems partially by using
the concept of generator, which is important for a study of certain classes of groups,
such as cyclic groups and finitely generated groups, which have applications to number
theory and homological algebra.

Finitely Generated Groups and Cyclic Groups There are some groups G which 
are generated by a finite set of elements of G. Such groups are called finitely gener- 
ated and are very important in mathematics. A cyclic group is, in particular, gener- 
ated by a single element. 

Let $G$ be a group and $A$ be a non-empty subset of $G$. Consider the family $\mathcal{C}$ of all
subgroups of $G$ containing $A$. This is a non-empty family, because $G \in \mathcal{C}$. Now the
intersection of all subgroups of this family is again a subgroup of $G$. This subgroup
contains $A$ and is denoted by $\langle A \rangle$.

Definition 2.4.3 If $A$ is a non-empty subset of a group $G$, then $\langle A \rangle$ is called the
subgroup generated by $A$ in the group $G$. If $\langle A \rangle = G$, then $G$ is said to be generated
by $A$, and if $A$ contains a finite number of elements, then $G$ is said to be finitely
generated.

If $A$ contains a finite number of elements, say $A = \{a_1, a_2, \ldots, a_n\}$, then we write
$\langle a_1, a_2, \ldots, a_n \rangle$ in place of $\langle \{a_1, a_2, \ldots, a_n\} \rangle$.

The particular case when $A$ consists of a single element of the group is of immense
interest. For example, for the circle $S$ of radius 1 in the Euclidean plane, if $r$
is a rotation through an angle $2\pi/n$ radians about the origin, then $r^n = r \circ r \circ \cdots \circ r$
($n$ times) is the identity. This example leads to the concept of cyclic groups.

Definition 2.4.4 A subgroup $H$ of a group $G$ is called a cyclic subgroup of $G$ iff
$\exists$ an element $a \in G$ such that $H = \langle a \rangle$, and $G$ is said to be cyclic iff $\exists$ an element
$a \in G$ such that $G = \langle a \rangle$; $a$ is called a generator of $G$.

Theorem 2.4.6 Let $G$ be a group and $a \in G$. Then $\langle a \rangle = \{a^n : n \in \mathbf{Z}\}$.

Proof Let $T = \{a^n : n \in \mathbf{Z}\}$. Then for $c, d \in T$, there exist $r, t \in \mathbf{Z}$ such that $c = a^r$
and $d = a^t$. Now $cd^{-1} = a^r(a^t)^{-1} = a^r a^{-t} = a^{r-t} \in T \Rightarrow T$ is a subgroup of $G$.

Clearly, $\{a\} \subseteq T$. Let $H$ be a subgroup of $G$ such that $\{a\} \subseteq H$. Then $a \in H$ and
hence $a^{-1} \in H$. As a result $a^n \in H$ for any $n \in \mathbf{Z}$ and then $T \subseteq H$. Hence $T$ is the
intersection of all subgroups of $G$ containing $\{a\}$. This shows that $\langle a \rangle = T = \{a^n :
n \in \mathbf{Z}\}$. □

Remark If a group $G$ is cyclic, then there exists an element $a \in G$ such that any
$x \in G$ is a power of $a$, that is, $x = a^n$ for some $n \in \mathbf{Z}$. Then the mapping $f : \mathbf{Z} \to G$
defined by $f(n) = a^n$ is an epimorphism. Moreover, if $\ker f = \{0\}$, $f$ is a monomorphism
by Theorem 2.3.3(i), and hence $f$ is an isomorphism; such a $G$ is called
an infinite cyclic group or a free cyclic group. For example, $\mathbf{Z}$ is an infinite additive
cyclic group with generator 1. Clearly, $-1$ is also a generator.

Remark Every cyclic group is commutative. The converse is not true. For example,
the Klein four-group (see Ex. 22 of Exercises-III) is a finite commutative group
which is not cyclic, and $(\mathbf{Q}, +)$ is an infinite commutative group which is not finitely
generated (see Ex. 1 of SE-I) and hence not cyclic.

Theorem 2.4.7 An infinite cyclic group is torsion free.

Proof Let $G$ be an infinite cyclic group and $1 \neq a \in G$. Then $G = \langle g \rangle$ for some
$g \in G$. Consequently, $a = g^t$, where $t$ is a non-zero integer. If $a$ is of finite order
$r > 0$, then $g^{tr} = a^r = 1$ with $tr \neq 0$, a contradiction, since $g$ is of infinite order.
Consequently, $a$ is of infinite order. □

Let $G$ be a group. If $G$ contains a finite number of elements, then the group $G$
is said to be a finite group; otherwise, the group $G$ is called an infinite group. The
number of elements of a finite group is called the order of the group. We write $|G|$
or $O(G)$ to denote the order of a group $G$.

Theorem 2.4.8 Let $G$ be a group and $a \in G$. Then the order of the cyclic subgroup
$\langle a \rangle$ is equal to $O(a)$ when it is finite, and is infinite when $O(a)$ is infinite.

Proof It follows from Theorem 2.3.5 and the Remark following Definition 2.3.5. □

In particular, if $\gcd(t, r) = 1$ in Theorem 2.3.5(iv), then $O(a) = O(a^r)$ and therefore
each of $a$ and $a^r$ generates the same group $G$. This proves the following theorem.

Theorem 2.4.9 If $G$ is a cyclic group of order $n$, then there are $\phi(n)$ distinct
elements in $G$, each of which generates $G$, where $\phi(n)$ is the number of positive
integers not exceeding $n$ and relatively prime to $n$.

The function $\phi(n)$ is called the Euler $\phi$-function in honor of Leonhard Euler
(1707-1783).

For example, for $\mathbf{Z}_{14}$, $\phi(14) = 6$ and $\mathbf{Z}_{14}$ can be generated by each of the elements
(1), (3), (5), (9), (11) and (13), since $r(1)$ is a generator iff $\gcd(r, 14) = 1$,
where $0 < r < 14$.

Remark If $a, b$ are relatively prime positive integers then $\phi(ab) = \phi(a)\phi(b)$. Moreover,
for any positive prime integer $p$, $\phi(p) = p - 1$, $\phi(p^n) = p^n - p^{n-1}$ for all
integers $n \ge 1$, and if $n = p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}$, then $\phi(n) = p_1^{n_1 - 1}(p_1 - 1)\, p_2^{n_2 - 1}(p_2 -
1) \cdots p_r^{n_r - 1}(p_r - 1)$ (see Chap. 10).
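
The count of generators in Theorem 2.4.9 can be confirmed by enumeration for small $n$. In the Python sketch below, `generated`, `generators` and `phi` are names chosen for this illustration; `phi` counts by brute force rather than through the product formula of the Remark.

```python
from math import gcd

def generated(r, n):
    """The cyclic subgroup of Z_n generated by the class (r)."""
    return {(k * r) % n for k in range(n)}

def generators(n):
    """All r in {0, ..., n-1} whose class generates Z_n."""
    return [r for r in range(n) if generated(r, n) == set(range(n))]

def phi(n):
    """Euler phi by direct count (adequate for small n)."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)

print(generators(14))
print(phi(14))
print(all(len(generators(n)) == phi(n) for n in range(1, 40)))
```

For $n = 14$ the enumeration returns exactly the six classes listed in the text.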

Theorem 2.4.10 Every non-trivial subgroup of the additive group $\mathbf{Z}$ is cyclic.

Proof Let $H$ be a subgroup of $\mathbf{Z}$, and $H \neq \{0\}$. Then $H$ contains a smallest positive
integer $a$ (say). To prove the theorem it is sufficient to show that any integer $x \in H$ is
of the form $na$ for some $n \in \mathbf{Z}$. If $x = na + r$ where $0 \le r < a$, then $r = x - na \in H$
as $H$ is a group. Consequently, $r = 0$ by the minimality of $a$, so $x = na$. □

Theorem 2.4.11 If $G$ is a cyclic group, then every subgroup of $G$ is cyclic.

Proof If $G$ is infinite cyclic, then $G \cong \mathbf{Z}$. Then the proof follows from
Theorem 2.4.10. Next suppose that $G$ is a finite cyclic group generated by $a$ and $H$ a
subgroup of $G$. Let $t$ be the smallest positive integer such that $a^t \in H$. We claim that
$a^t$ is a generator of $H$. Let $a^m \in H$. If $m = tq + r$, $0 \le r < t$, then $a^r = a^{m - tq} =
a^m(a^t)^{-q} \in H$, since $a^m, a^t \in H$. Since $0 \le r < t$, it follows by the property of $t$
that $r = 0$. Therefore $m = tq$ and hence $a^m = (a^t)^q$. Thus every element $a^m \in H$ is
some power of $a^t$. Again, since $a^t \in H$, every power of $a^t$ is in $H$. Hence $H = \langle a^t \rangle$.
Thus $H$ is cyclic. □
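
For a small finite cyclic group, Theorem 2.4.11 can be verified exhaustively. The sketch below lists every subgroup of $\mathbf{Z}_{12}$ by testing all non-empty subsets with the criterion of Theorem 2.4.1 (in additive form, $a - b \in H$) and compares the result with the set of cyclic subgroups. The names `is_subgroup` and `generated` and the choice $n = 12$ are illustrative assumptions.

```python
from itertools import combinations

def is_subgroup(H, n):
    """Theorem 2.4.1 in additive notation: H nonempty and a - b in H for all a, b in H."""
    return bool(H) and all((a - b) % n in H for a in H for b in H)

def generated(r, n):
    """The cyclic subgroup of Z_n generated by the class (r)."""
    return frozenset((k * r) % n for k in range(n))

n = 12
subgroups = [
    frozenset(S)
    for size in range(1, n + 1)
    for S in combinations(range(n), size)
    if is_subgroup(frozenset(S), n)
]
cyclic_subgroups = {generated(r, n) for r in range(n)}

print(sorted(sorted(H) for H in subgroups))
print(set(subgroups) == cyclic_subgroups)
```

The brute-force search over subsets is exponential in $n$ and is meant only as a check on a small case.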

The following theorem gives a characterization of the cyclic groups. 

Theorem 2.4.12 Let $(G, \cdot)$ be a cyclic group. Then

(i) $(G, \cdot)$ is isomorphic to $(\mathbf{Z}, +)$ iff $G$ is infinite;

(ii) $(G, \cdot)$ is isomorphic to $(\mathbf{Z}_n, +)$ iff $G$ is finite and $|G| = n$.

Proof (i) Let $(G, \cdot)$ be an infinite cyclic group. Then the group $(G, \cdot)$ is isomorphic
to $(\mathbf{Z}, +)$. Conversely, suppose $(G, \cdot)$ is isomorphic to $(\mathbf{Z}, +)$. In this case, since
$(\mathbf{Z}, +)$ is infinite, $(G, \cdot)$ is also infinite, as an isomorphism is a bijection.

(ii) Suppose $(G, \cdot)$ is a finite cyclic group. Then the case (i) cannot arise and
therefore the case (ii) must occur. In this case $(G, \cdot)$ is isomorphic to $(\mathbf{Z}_n, +)$, where
$n = |G|$. Conversely, suppose $(G, \cdot)$ is isomorphic to $(\mathbf{Z}_n, +)$. Then $|\mathbf{Z}_n| = n \Rightarrow
|G| = n \Rightarrow G$ is finite. □

Remark Let $G$ and $K$ be finite cyclic groups such that $|G| = m \neq n = |K|$. Then $G$
and $K$ cannot be isomorphic, i.e., there cannot exist an isomorphism $f : G \to K$.

Theorem 2.4.13 Let $G$ be a cyclic group of order $n$ and $H$ a cyclic group of order
$m$ such that $\gcd(m, n) = 1$. Then $G \times H$ is a cyclic group of order $mn$. Moreover, if
$a$ is a generator of $G$ and $b$ a generator of $H$, then $(a, b)$ is a generator of $G \times H$.

Proof Clearly, $G \times H$ is a group under pointwise multiplication defined by
$(g, h)(g', h') = (gg', hh')$ for all $g, g'$ in $G$ and $h, h'$ in $H$. If $e_1$ and $e_2$ are the identity
elements of $G$ and $H$ respectively, then $a^n = e_1$ and $1 \le r < n \Rightarrow a^r \neq e_1$; $b^m = e_2$
and $1 \le r < m \Rightarrow b^r \neq e_2$. Let $x = (a, b) \in G \times H$. Then $x^{mn} = (a^{mn}, b^{mn})$ by
definition of multiplication (i.e., pointwise multiplication) in $G \times H$

$$\Rightarrow x^{mn} = (e_1, e_2) \Rightarrow \text{order } x \le mn. \tag{2.2}$$

Again, for a positive integer $t$,

$$\begin{aligned}
x^t = (e_1, e_2) &\Rightarrow (a^t, b^t) = (e_1, e_2) \\
&\Rightarrow a^t = e_1 \text{ and } b^t = e_2 \\
&\Rightarrow n \mid t \text{ and } m \mid t \\
&\Rightarrow mn \mid t \quad (\text{since } \gcd(m, n) = 1) \\
&\Rightarrow t \ge mn \\
&\Rightarrow \text{order } x \ge mn.
\end{aligned} \tag{2.3}$$

Consequently, (2.2) and (2.3) $\Rightarrow$ order $x$ = order $(a, b) = mn$. Since $|G \times H| =
mn$ and the order of the element $(a, b)$ is $mn$, it follows that $(a, b)$ generates $G \times H$,
i.e., $G \times H$ is a cyclic group of order $mn$. □
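
Theorem 2.4.13 can be illustrated with the additive groups $\mathbf{Z}_m \times \mathbf{Z}_n$, where $(1, 1)$ plays the role of $(a, b)$. The Python sketch below computes orders by direct search; the function names are assumptions made for this example, and the comparison with $\gcd(m, n) = 1$ in the last line also uses the converse direction, a standard fact that the theorem itself does not state.

```python
from math import gcd

def order_in_product(x, m, n):
    """Order of x = (g, h) in Z_m x Z_n under componentwise addition."""
    g, h = x
    k = 1
    while (k * g) % m != 0 or (k * h) % n != 0:
        k += 1
    return k

def is_cyclic_product(m, n):
    """Z_m x Z_n is cyclic iff some element has order m*n."""
    return any(
        order_in_product((g, h), m, n) == m * n
        for g in range(m) for h in range(n)
    )

print(order_in_product((1, 1), 3, 4))   # gcd(3, 4) = 1
print(is_cyclic_product(3, 4), is_cyclic_product(2, 4))
print(all(
    is_cyclic_product(m, n) == (gcd(m, n) == 1)
    for m in range(1, 8) for n in range(1, 8)
))
```

When $\gcd(m, n) > 1$, every element has order dividing $\operatorname{lcm}(m, n) < mn$, which is why no generator is found in that case.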


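Remark The converse of the last part of Theorem 2.4.13 is also true. Suppose $(u, v)$
is a generator of $G \times H$. Let $a$ be a fixed generator of $G$ and $b$ a fixed generator
of $H$. Then since $\langle (u, v) \rangle = G \times H$, $\exists$ positive integers $k$ and $r$ such that $(u, v)^k =
(a, e_2) \in G \times H$ and $(u, v)^r = (e_1, b) \in G \times H$. Then $u^k = a$ and $v^r = b \Rightarrow G =
\langle a \rangle \subseteq \langle u \rangle \subseteq G$ and $H = \langle b \rangle \subseteq \langle v \rangle \subseteq H \Rightarrow \langle u \rangle = G$ and $\langle v \rangle = H \Rightarrow u$ is a generator
of $G$ and $v$ is a generator of $H$.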


Problems 


Problem 1 Let $G$ ($\neq \{1\}$) be a group which has no other subgroup except $G$ and
$\{1\}$. Show that $G$ is isomorphic to $\mathbf{Z}_p$ for some prime $p$.

Solution Let $a \in G - \{1\}$ (since $G \neq \{1\}$). If $T = \langle a \rangle$, then $T = G$ (since $a \in T$ and
$a \neq 1$) $\Rightarrow G$ is a cyclic group generated by $a$. If possible, let $G$ be infinite cyclic.
Then $G = \{a^n : n \in \mathbf{Z}\}$, $a^m \neq a^n$ for $n \neq m$ and $a^0 = 1$. Consider $H = \{a^{2n} : n \in \mathbf{Z}\}$.
Then $H$ is a proper subgroup of $G$ (as $a \notin H$) and $H \neq \{1\}$. This contradicts the fact
that the only subgroups of $G$ are $G$ and $\{1\}$. So, $G$ cannot be infinite cyclic. In other
words, $G$ must be finite cyclic. Let $|G| = n$. If possible, let $n = rs$, $1 < r < n$,
$1 < s < n$. Since $r \mid |G|$ and $G$ is cyclic, $G$ has a cyclic subgroup $H$ of order $r$
(for instance, $H = \langle a^s \rangle$). This contradicts the fact that the only subgroups of $G$ are $G$
and $\{1\}$. So, $n$ cannot be composite, i.e., $n$ is some prime $p$, since $G \neq \{1\}$. Thus $(G, \cdot)$
is a cyclic group of order $p$ and hence isomorphic to $(\mathbf{Z}_p, +)$ by Theorem 2.4.12(ii).

Problem 2 Let $G$ be a group which is isomorphic to every proper subgroup $H$ of
$G$ with $H \neq \{1\}$ and assume that there exists at least one such proper subgroup
$H \neq \{1\}$. Show that $G$ is infinite cyclic. Conversely, if $G$ is an infinite cyclic group,
show that $G$ is isomorphic to every proper subgroup $H$ of $G$ such that $H \neq \{1\}$.

Solution There exists an element $a \in G$ such that $a \neq 1$. Let $P$ be the cyclic subgroup
generated by $a$, so that $P \neq \{1\}$. If $P = G$, then $G$ is cyclic; otherwise $P$ is a proper
subgroup and, by hypothesis, $P$ is isomorphic to $G$, so $G$ is again cyclic. Moreover, $G$ is
isomorphic to a proper subgroup $H \neq \{1\}$, and a finite set cannot be in bijection with a
proper subset of itself. Consequently, $G$ is infinite.

Conversely, let $G$ be an infinite cyclic group and $H$ a proper subgroup of $G$ such
that $H \neq \{1\}$. Then $H = \langle a^r \rangle$ ($r > 1$), where $a$ is a generator of $G$. Consider the function
$f : G \to H$ given by $f(a) = a^r$. Then $f(a^n) = a^{rn}$. Clearly, $f$ is an isomorphism.


2.5 Lagrange’s Theorem 

One of the most important invariants of a finite group is its order. While studying
finite groups Lagrange established a relation between the order of a finite group and
the order of any of its subgroups. He proved that the order of any subgroup $H$ of a finite
group $G$ divides the order of $G$. This is established by certain decompositions of
the underlying set $G$ into a family of subsets $\{aH : a \in G\}$ or $\{Ha : a \in G\}$, called
cosets of $H$.

Let $G$ be a group and $H$ a subgroup of $G$. If $a \in G$, then $aH$ denotes the subset
$\{ah \in G : h \in H\}$ and $Ha$ denotes the subset $\{ha \in G : h \in H\}$ of $G$. Now $a = a1 =
1a$ shows that $a \in aH$ and $a \in Ha$. The set $aH$ is called the left coset of $H$ in $G$
containing $a$ and $Ha$ is called the right coset of $H$ in $G$ containing $a$. A subset $A$
of $G$ is called a left (right) coset of $H$ in $G$ iff $A = aH$ ($= Ha$) for some $a \in G$.

Lemma 2.5.1 If H is a subgroup of a group G and a, b ∈ G, then each of the
following statements is true.

(i) aH = H iff a ∈ H;

(i)′ Ha = H iff a ∈ H;

(ii) aH = bH iff a⁻¹b ∈ H;

(ii)′ Ha = Hb iff ab⁻¹ ∈ H;

(iii) Either aH = bH or aH ∩ bH = ∅;

(iii)′ Either Ha = Hb or Ha ∩ Hb = ∅;

(iv) G = ⋃_{a∈G} aH;

(iv)′ G = ⋃_{a∈G} Ha;

(v) There exists a bijection aH → H for each a ∈ G;

(v)′ There exists a bijection Ha → H for each a ∈ G;

(vi) If ℒ is the set of all left cosets of H in G and ℛ is the set of all right cosets of
H in G, then there exists a bijection from ℒ onto ℛ.

Proof (i)-(iv) follow trivially.

(v) If x ∈ aH, then there exists h ∈ H such that x = ah.

Suppose x = ah_1 = ah_2 (h_1, h_2 ∈ H). Then h_1 = a⁻¹(ah_1) = a⁻¹(ah_2) = h_2.
Hence for each x ∈ aH there exists a unique h ∈ H such that x = ah. Define f :
aH → H by f(x) = h when x = ah ∈ aH. Now one can show that f is a bijective
mapping from aH onto H.

(vi) Define f : ℒ → ℛ by f(aH) = Ha⁻¹. One can show that f is a well defined
bijective mapping. □

Example 2.5.1 (i) Consider the group S_3 on the set I_3 = {1, 2, 3}. In S_3, H =
{e, (12)} is a subgroup. For this subgroup eH = {e, (12)}, (13)H = {(13), (123)}
and (23)H = {(23), (132)} are three distinct left cosets and S_3 = eH ∪ (13)H ∪
(23)H. Again He = {e, (12)}, H(13) = {(13), (132)}, H(23) = {(23), (123)} are
three distinct right cosets of H and

S_3 = He ∪ H(13) ∪ H(23).

Hence the number of left cosets of H is three and the number of right cosets of H
is also three. But ℒ ≠ ℛ.

(ii) Consider the group Z_6 = {0, 1, 2, 3, 4, 5} (see Example 2.3.6). In Z_6, H =
{0, 3} is a subgroup. Using additive notation, the left cosets H = {0, 3}, 1 + H =
{1, 4}, 2 + H = {2, 5} partition Z_6.
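
The coset computation of Example 2.5.1(i) can be checked by machine. The following is a minimal sketch in Python, not part of the text's development: permutations of {1, 2, 3} are stored 0-based as tuples, the product pq means "apply q first, then p" (the convention that reproduces the cosets listed above), and the names compose, left_cosets and right_cosets are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the product pq: apply q first, then p (0-based tuples)."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))   # all 6 permutations of {0, 1, 2}
e = (0, 1, 2)
t12 = (1, 0, 2)                     # the transposition (12), written 0-based
H = [e, t12]

# Each coset is stored as a frozenset so that equal cosets are identified.
left_cosets = {frozenset(compose(a, h) for h in H) for a in S3}
right_cosets = {frozenset(compose(h, a) for h in H) for a in S3}

print(len(left_cosets), len(right_cosets))   # number of left and right cosets
print(left_cosets == right_cosets)           # do the two families coincide?
```

Both families have three members, but they are different families of subsets, in agreement with the example.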

We now prove several interesting results about finite groups. 

One of the most important invariants of a finite group is its order.

Let G be a finite group. Any subgroup H of G is also a finite group. Lagrange’s 
Theorem (Theorem 2.5.1) establishes a relation between the order of H and the 
order of G. It plays a very important role in the study of finite groups. 

Theorem 2.5.1 (Lagrange’s Theorem) The order of a subgroup of a finite group 
divides the order of the group. 

Proof Let G be a finite group of order n and H a subgroup of G. Let |H| = m.
Now 1 ≤ m ≤ n. We can write G = ⋃_{a∈G} aH. Since any two left cosets of H are
either identical or disjoint, there exist elements a_1, a_2, ..., a_t (t ≤ n) such that the left
cosets a_1H, a_2H, ..., a_tH are all distinct and G = a_1H ∪ a_2H ∪ ··· ∪ a_tH. Hence

|G| = |a_1H| + |a_2H| + ··· + |a_tH| (|a_iH| denotes the number of elements in a_iH).

Let H = {h_1, h_2, ..., h_m}. Then aH = {ah_1, ah_2, ..., ah_m}. The cancellation
property of a group implies that ah_1, ah_2, ..., ah_m are m distinct elements, so that
|H| = |aH|. Therefore n = |G| =
|a_1H| + |a_2H| + ··· + |a_tH| = |H| + |H| + ··· + |H| (t times) = t|H| = tm. Hence
m divides n, where n = |G|. □
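
The counting argument in this proof can be traced on a small example. The sketch below (Python; the choice of Z_12 and of the subgroup {0, 4, 8} is an illustrative assumption, not taken from the text) lists the distinct cosets and checks that they have equal size, cover G, and satisfy |G| = t|H|.

```python
n = 12
G = list(range(n))        # Z_12 under addition mod 12
H = [0, 4, 8]             # the subgroup generated by 4

# Distinct cosets a + H, collected as a set of frozensets.
cosets = {frozenset((a + h) % n for h in H) for a in G}
t = len(cosets)           # the number of distinct cosets

same_size = all(len(c) == len(H) for c in cosets)   # |aH| = |H| for every a
covers_G = set().union(*cosets) == set(G)           # G is the union of the cosets

print(t, len(H), t * len(H) == len(G), same_size, covers_G)
```

Since the cosets cover G, all have |H| elements, and t|H| = |G|, they must be pairwise disjoint, which is exactly the decomposition used in the proof.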

Corollary 1 The order of an element of a finite group divides the order of the 
group. 

Proof Let G be a finite group and a ∈ G. Then O(a) = |⟨a⟩| by Theorem 2.4.8.
Hence the corollary follows from Lagrange’s Theorem 2.5.1. □

Corollary 2 A group of prime order is cyclic. 

Proof Let G be a group of prime order p and 1 ≠ a ∈ G. Then O(a) is a factor of p
by Corollary 1. As p is prime, either O(a) = p or O(a) = 1. Now a ≠ 1 ⇒ O(a) =
p ⇒ |⟨a⟩| = p = |G| ⇒ G is a cyclic group, since ⟨a⟩ ⊆ G. □

Remark The converse of Lagrange’s Theorem asserts that if a positive integer m
divides the order n of a finite group G, then G contains a subgroup of order m. But
it is not true in general. For example, the alternating group A_4 of order 12 has no
subgroup of order 6 (see Ex. 4 of SE-I). However, the following properties of special
finite groups show that a partial converse is true.

1. If G is a finite abelian group, then corresponding to every positive divisor m of 
n, there exists a subgroup of order m (see Corollary 2 of Cauchy’s Theorem in 
Chap. 3). 

2. If m = p, a prime integer, then G has a subgroup of order p (see Cauchy’s 
Theorem in Chap. 3). 

3. If m is a power of a prime p , then G has a subgroup of order m (see Sylow’s 
First Theorem in Chap. 3). 
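
The failure of the converse for A_4 can also be confirmed by exhaustive search. The following Python sketch is only a brute-force check, not a substitute for the argument referred to in the Remark; permutations of {1, 2, 3, 4} are stored 0-based as tuples and all function names are illustrative. It uses the fact that a non-empty finite subset closed under the product is a subgroup.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even iff it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]
e = tuple(range(4))

def is_closed(S):
    """Closure under the product; for a finite non-empty set this gives a subgroup."""
    return all(compose(a, b) in S for a in S for b in S)

def count_subgroups(m):
    """Count the subsets of A4 of size m that contain e and are closed."""
    others = [p for p in A4 if p != e]
    candidates = (set(c) | {e} for c in combinations(others, m - 1))
    return sum(1 for S in candidates if is_closed(S))

counts = {m: count_subgroups(m) for m in (1, 2, 3, 4, 6, 12)}
print(len(A4), counts)
```

The count for m = 6 is 0, although 6 divides 12, while every other divisor of 12 is the order of at least one subgroup.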

Let G be a group and H be a subgroup of G. We have proved that there exists a
bijective mapping f : ℒ → ℛ, where ℒ is the set of all left cosets of H in G and ℛ
is the set of all right cosets of H in G. Then the cardinal number of ℒ is the same
as the cardinal number of ℛ. This cardinal number is called the index of H in G
and is denoted by [G : H]. If ℒ is a finite set, then the cardinal number of ℒ is the
number of left cosets of H in G. Hence in this case the number of left cosets of H
(equivalently the number of right cosets of H) in G is the index of H in G.

Theorem 2.5.2 If H is a subgroup of a finite group G, then

[G : H] = |G| / |H|.

Proof Let G be a finite group of order n. Then a subgroup H of G must also be of
finite order, r (say). Every left coset of H contains exactly r distinct elements. If
[G : H] = i, then n = ri, since the cosets are disjoint. This proves the theorem. □
Given two finite subgroups H and K of a group G, HK = {hk : h ∈ H and k ∈
K} may not be a subgroup of G. Hence |HK| may not divide |G|. However,
Theorem 2.5.3 determines |HK|.


Theorem 2.5.3 If H and K are two finite subgroups of a group G, then

|HK| = |H||K| / |H ∩ K|.


Proof Let A = H ∩ K. Then A is a subgroup of G such that A ⊆ H and A ⊆ K.
Now A is a subgroup of the finite group H. Hence [H : A] is finite. Let [H : A] =
t. Then there exist elements x_1, x_2, ..., x_t in H such that x_1A, x_2A, ..., x_tA are
distinct left cosets of A in H and H = ⋃_{i=1}^{t} x_iA. Now HK = (⋃_{i=1}^{t} x_iA)K =
⋃_{i=1}^{t} x_iK, since A ⊆ K. We claim that x_1K, x_2K, ..., x_tK are t distinct left cosets
of K.

Suppose x_iK = x_jK for some i and j, where 1 ≤ i ≠ j ≤ t. Then x_j⁻¹x_i ∈ K.
But x_j⁻¹x_i ∈ H. Hence x_j⁻¹x_i ∈ H ∩ K = A. This shows that x_iA = x_jA; this is not
true as x_1A, ..., x_tA are distinct. As a result |HK| = |x_1K| + |x_2K| + ··· + |x_tK|.
Also, by Lemma 2.5.1(v) it follows that |x_iK| = |K|. Consequently,

|HK| = t|K| = [H : A]|K| = |H||K| / |A| = |H||K| / |H ∩ K|. □



Corollary If H and K are finite subgroups of a group G such that H ∩ K = {1},
then |HK| = |H||K|.
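
A small instance shows both the formula and the warning that HK need not be a subgroup. The sketch below is in Python; the subgroups H = {e, (12)} and K = {e, (13)} of S_3 are an illustrative choice, with permutations stored 0-based as tuples.

```python
def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

e = (0, 1, 2)
H = {e, (1, 0, 2)}      # {e, (12)}
K = {e, (2, 1, 0)}      # {e, (13)}

HK = {compose(h, k) for h in H for k in K}
predicted = len(H) * len(K) // len(H & K)     # |H||K| / |H ∩ K|
closed = all(compose(x, y) in HK for x in HK for y in HK)

print(len(HK), predicted, closed)
```

Here |HK| = 4 agrees with the formula, and HK is not closed under the product, as it must be since 4 does not divide |S_3| = 6.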


2.6 Normal Subgroups, Quotient Groups and Homomorphism 
Theorems 

In this section we study normal subgroups and develop the theory of
quotient groups with the help of homomorphisms. We show how to construct
isomorphic replicas of all homomorphic images of a specified abstract group. For this
purpose we introduce the concept of normal subgroups. For some subgroups of a
group, every left coset is a right coset and conversely, which implies that for such
subgroups the concepts of a left coset and a right coset coincide. Such subgroups
are called normal subgroups. Galois first recognized that the subgroups for which
left and right cosets coincide are distinguished ones. We shall now study those
subgroups H of a group G such that aH = Ha for all a ∈ G. Such subgroups play
an important role in determining both the structure of a group and the nature of the
homomorphisms with domain G.

Definition 2.6.1 Let H be a subgroup of a group G and a, b ∈ G. Then a is said to
be right (left) congruent to b modulo H, denoted by a ρ_r b (mod H) (a ρ_l b (mod H)),
iff ab⁻¹ ∈ H (a⁻¹b ∈ H).
Theorem 2.6.1 Let H be a subgroup of a group G. Then the relation ρ_r (ρ_l) on G
is an equivalence relation and the corresponding equivalence class of a ∈ G is the
right (left) coset Ha (aH). ρ_r (ρ_l) is called the right (left) congruence relation on
G modulo H.

Proof We write a ρ b for a ρ_r b (mod H) and then prove the theorem only for right
congruence ρ and right cosets.

aa⁻¹ = 1 ∈ H ⇒ a ρ a ∀a ∈ G. Again a ρ b ⇒ ab⁻¹ ∈ H ⇒ (ab⁻¹)⁻¹ ∈
H ⇒ ba⁻¹ ∈ H ⇒ b ρ a. Finally, a ρ b and b ρ c ⇒ ab⁻¹ ∈ H and bc⁻¹ ∈ H ⇒
(ab⁻¹)(bc⁻¹) ∈ H ⇒ ac⁻¹ ∈ H ⇒ a ρ c. Consequently, ρ is an equivalence
relation. Now for each a ∈ G, aρ = {x ∈ G : x ρ a} = {x ∈ G : xa⁻¹ ∈ H} = {x ∈
G : xa⁻¹ = h for some h ∈ H} = {x ∈ G : x = ha, h ∈ H} = {ha ∈ G : h ∈ H} =
Ha. □

Theorem 2.6.2 If H is a subgroup of a group G, then the following conditions are 
equivalent. 

(i) Left congruence ρ_l and right congruence ρ_r on G modulo H coincide;

(ii) every left coset of H in G is also a right coset of H in G;

(iii) aH = Ha for all a ∈ G;

(iv) for all a ∈ G, aHa⁻¹ ⊆ H, where aHa⁻¹ = {aha⁻¹ : h ∈ H}.

Proof (i) ⇔ (iii) ρ_l = ρ_r ⇔ aρ_l = aρ_r ∀a ∈ G ⇔ aH = Ha ∀a ∈ G by
Theorem 2.6.1.

(iii) ⇔ (iv) Suppose aH = Ha ∀a ∈ G. Then for each h ∈ H, ah = h′a for some
h′ ∈ H. Consequently, aha⁻¹ = h′ ∈ H ∀h ∈ H. This shows that aHa⁻¹ ⊆ H for
every a ∈ G. Conversely, suppose aHa⁻¹ ⊆ H for every a ∈ G. Then H = a(a⁻¹Ha)a⁻¹ ⊆
aHa⁻¹, since a⁻¹Ha ⊆ H (by hypothesis). Consequently, aHa⁻¹ = H for every
a ∈ G. This shows that aH = Ha ∀a ∈ G.

(ii) ⇔ (iii) Suppose aH = Hb for a, b ∈ G. Then a ∈ Ha ∩ Hb. Since two right
cosets of H in G are either disjoint or equal, it follows that Ha = Hb and hence (ii)
⇒ (iii). Its converse is trivial. □

Definition 2.6.2 Let G be a group and H a subgroup of G. Then H is said to be
a normal subgroup of G, denoted by H ◁ G, iff H satisfies any one of the equivalent
conditions of Theorem 2.6.2.

Thus for a normal subgroup, we need not distinguish between the left and right 
cosets. 

In any group G, G and {1} are normal subgroups, called trivial normal subgroups. 
Every subgroup of a commutative group is a normal subgroup. 

Example 2.6.1 If S_3 is the symmetric group of order 6, then H = {e, (123), (132)}
is a subgroup of S_3. Now for any a ∈ H, aH = H = Ha. The elements (12),
(23), and (13) ∉ H. Now (12)H = {(12), (23), (13)} = H(12). Similarly, (13)H =
H(13) and (23)H = H(23). Consequently, H is a normal subgroup of S_3. Clearly,
K = {e, (12)} is not a normal subgroup of S_3.
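
Condition (iv) of Theorem 2.6.2 gives a direct test for normality that is easy to run on this example. The following is a hedged Python sketch (0-based tuples for permutations; the helper names inverse and is_normal are illustrative, not notation from the text).

```python
from itertools import permutations

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(H, G):
    """Check g h g^(-1) in H for all g in G and h in H (condition (iv))."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)

S3 = list(permutations(range(3)))
H = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # {e, (123), (132)}
K = {(0, 1, 2), (1, 0, 2)}              # {e, (12)}

print(is_normal(H, S3), is_normal(K, S3))
```

The test succeeds for H and fails for K, as stated in the example.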
Theorem 2.6.3 Let A and B be two subgroups of a group G.

(i) If A is a normal subgroup of G, then AB = BA is a subgroup of G.

(ii) If A and B are normal subgroups of G, then A ∩ B is a normal subgroup of G.

(iii) If A and B are normal subgroups of G such that A ∩ B = {1}, then ab =
ba ∀a ∈ A and ∀b ∈ B.

Proof (i) Let x = ab ∈ AB. Now bA = Ab ⇒ ab = ba_1 for some a_1 ∈ A ⇒
x = ba_1 ∈ BA ⇒ AB ⊆ BA. Similarly, BA ⊆ AB. Consequently, AB = BA.
Again for b, b′ ∈ B and a, a′ ∈ A, (ba)(b′a′)⁻¹ = ba(a′⁻¹b′⁻¹) = b(aa′⁻¹)b′⁻¹ =
ba_1b′⁻¹ (a_1 ∈ A) = (bb′⁻¹)(b′a_1b′⁻¹) = b_1a_2 ∈ BA, where a_2 ∈ A, b_1 ∈ B. This
implies BA = AB is a subgroup of G.

(ii) Clearly, H = A ∩ B is a normal subgroup of G, since if h ∈ H and g ∈ G,
then ghg⁻¹ belongs to both A and B and hence is in H.

(iii) Consider the element x = aba⁻¹b⁻¹ (a ∈ A, b ∈ B). Now x = (aba⁻¹)b⁻¹ ∈
(aBa⁻¹)b⁻¹ ⊆ B and again x = a(ba⁻¹b⁻¹) ∈ a(bAb⁻¹) ⊆ aA = A.

Hence x ∈ A ∩ B = {1}. Consequently, aba⁻¹b⁻¹ = 1. This shows that ab =
ba ∀a ∈ A and ∀b ∈ B. □

We now show how to make the set of cosets into a group, called a quotient group. 

Theorem 2.6.4 Let H be a normal subgroup of a group G and G/H the set of all 
cosets of H in G. Then G/H is a group. 

Proof Define multiplication of cosets by (aH) · (bH) = (ab)H, ∀a, b ∈ G. We
show that the multiplication is well defined. Let aH = xH and bH = yH. Then
a ∈ aH and b ∈ bH are such that a = xh_1 and b = yh_2 for some h_1, h_2 ∈ H. Now,
(xy)⁻¹(ab) = y⁻¹x⁻¹ab = y⁻¹x⁻¹xh_1yh_2 = y⁻¹h_1yh_2 ∈ H, since y⁻¹h_1y ∈ H
as H is a normal subgroup of G. Consequently, by Lemma 2.5.1, it follows that
(ab)H = (xy)H ⇒ (aH) · (bH) = (xH) · (yH). Then G/H is a group where the
identity element is the coset H, and the inverse of aH is a⁻¹H. □

Corollary If H is a normal subgroup of a finite group G then \G/H\ = |G|/|H|. 

Proof It follows from Theorem 2.5.2. □ 

Definition 2.6.3 If H is a normal subgroup of a group G, then the group G/H is called
the factor group (or quotient group) of G by H.

Remark If G is an additive abelian group, the group operation on G/H is defined by 
(a + H) + (b + H) = (a + b) + H 

and the corresponding quotient group G/H is sometimes called difference group. 

We now show that the construction of quotient groups is closely related to homo- 
morphisms of groups and prove a natural series of theorems. 
Theorem 2.6.5 If f : G → T is a homomorphism of groups, then the kernel of f
is a normal subgroup of G. Conversely, if H is a normal subgroup of G, then the
map

π : G → G/H, given by π(g) = gH, g ∈ G

is an epimorphism with kernel H.

Proof Clearly, 1 ∈ ker f. So, ker f ≠ ∅. Let x, y ∈ ker f. Then f(xy⁻¹) =
f(x)f(y⁻¹) = f(x)(f(y))⁻¹ = 1 · 1 = 1 ∀x, y ∈ ker f ⇒ ker f is a subgroup
of G by Theorem 2.4.1. Again if g ∈ G, then f(gxg⁻¹) = f(g)f(x)(f(g))⁻¹ =
f(g) · 1 · (f(g))⁻¹ = 1 ⇒ gxg⁻¹ ∈ ker f ⇒ ker f is a normal subgroup of G.

The map π : G → G/H given by π(g) = gH is clearly surjective. Moreover,
π(gg′) = gg′H = gHg′H = π(g)π(g′) ∀g, g′ ∈ G ⇒ π is an epimorphism, called
the canonical epimorphism or projection or natural homomorphism. Now ker π =
{g ∈ G : π(g) = H} = {g ∈ G : gH = H} = {g ∈ G : g ∈ H} = H. □

Remark Let G be a group and H a subgroup of G. Then H is a normal subgroup
of G iff H is the kernel of some homomorphism.

Theorem 2.6.6 (Fundamental Homomorphism Theorem for groups) Let f : G →
H be a homomorphism of groups and N a normal subgroup of G such that N ⊆
ker f. Then there is a unique homomorphism f̄ : G/N → H such that f̄(gN) =
f(g) ∀g ∈ G, Im f̄ = Im f and ker f̄ = ker f/N. Moreover, f̄ is an isomorphism
iff f is an epimorphism and ker f = N.

Proof If x ∈ gN, then x = gn for some n ∈ N and f(x) = f(gn) = f(g)f(n) =
f(g)1 = f(g), since by hypothesis N ⊆ ker f. This shows that the effect of f
is the same on every element of the coset gN. Thus the map f̄ : G/N → H,
gN ↦ f(g) is well defined. Now f̄(gN · hN) = f̄(ghN) = f(gh) = f(g)f(h) =
f̄(gN)f̄(hN) ∀g, h ∈ G ⇒ f̄ is a homomorphism. From the definition of f̄, it is
clear that Im f̄ = Im f. Moreover, gN ∈ ker f̄ ⇔ f(g) = 1 ⇔ g ∈ ker f.
Consequently, ker f̄ = {gN : g ∈ ker f} = ker f/N. Since f̄ is completely determined by
f, f̄ is unique. Clearly, f̄ is a monomorphism iff ker f̄ = ker f/N is a trivial
subgroup of G/N. The latter holds iff ker f = N. Moreover, f̄ is an epimorphism iff
f is an epimorphism. Consequently, f̄ is an isomorphism iff f is an epimorphism
and ker f = N. □

Corollary 1 (First Isomorphism Theorem) If f : G → H is a homomorphism of
groups, then f induces an isomorphism f̄ : G/ker f → Im f.

Corollary 2 If G and H are groups and f : G → H is an epimorphism, then the
groups G/ker f and H are isomorphic.
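
The First Isomorphism Theorem can be watched at work on a finite cyclic group. In the Python sketch below, the homomorphism x ↦ 3x (mod 12) of Z_12 into itself is an illustrative choice, not an example from the text; the code checks that f is constant on each coset of its kernel and that the induced map on cosets is a bijection onto Im f.

```python
n = 12
G = list(range(n))                     # Z_12 under addition mod 12

def f(x):
    """The homomorphism x -> 3x (mod 12) from Z_12 to itself."""
    return (3 * x) % n

kernel = sorted(x for x in G if f(x) == 0)
image = sorted({f(x) for x in G})
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

# f is constant on each coset of the kernel, so g + ker f -> f(g) is well defined.
induced = {c: {f(x) for x in c} for c in cosets}
well_defined = all(len(values) == 1 for values in induced.values())

# Distinct cosets receive distinct values, and every value of f is attained.
induced_values = sorted(next(iter(values)) for values in induced.values())
is_bijection_onto_image = induced_values == image

print(kernel, image, well_defined, is_bijection_onto_image)
```

The kernel {0, 4, 8} has four cosets, matching the four elements {0, 3, 6, 9} of the image.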

Remark The homomorphic images of a given abstract group G are the quotient
groups G/N by its different normal subgroups N.
Proposition 2.6.1 If G and H are groups and f : G → H is a homomorphism of
groups, N is a normal subgroup of G, and T is a normal subgroup of H such that
f(N) ⊆ T, then there is a homomorphism f̄ : G/N → H/T.

Proof Define f̄ : G/N → H/T by f̄(gN) = f(g)T. □

Corollary If f : G → H is a homomorphism of groups with ker f = N, then there
is a monomorphism f̄ : G/N → H.

Proof We have f(N) = {1} and {1} is a trivial normal subgroup of H. Then by
Proposition 2.6.1, f induces a homomorphism f̄ : G/N → H given by f̄(gN) =
f(g) ∀g ∈ G. Since f̄(gN) = 1 implies g ∈ ker f = N, f̄ is a monomorphism.
Moreover, if f : G → H is an epimorphism, then f̄ : G/N → H is also an
epimorphism. □

Remark These results also prove the First Isomorphism Theorem. 

Theorem 2.6.7 (Subgroup Isomorphism Theorem or the Second Isomorphism
Theorem) Let N be a normal subgroup of a group G and H a subgroup of G. Then
H ∩ N is a normal subgroup of H, HN is a subgroup of G and H/(H ∩ N) ≅ HN/N
(or NH/N), where HN = {hn : h ∈ H, n ∈ N}.

Proof Suppose x ∈ H ∩ N and h ∈ H. Then h⁻¹xh ∈ N as N is a normal subgroup
of G. Moreover, h⁻¹xh ∈ H as x ∈ H. Consequently, h⁻¹xh ∈ H ∩ N ⇒ H ∩ N
is a normal subgroup of H. Clearly, HN ≠ ∅. Now x, y ∈ HN ⇒ x = hn and
y = h′n′ for some n, n′ ∈ N, h, h′ ∈ H ⇒ xy⁻¹ = hn(h′n′)⁻¹ = hnn′⁻¹h′⁻¹ =
hn_1h′⁻¹ (for some n_1 ∈ N) = hh′⁻¹h′n_1h′⁻¹ = h_1n_2 (where hh′⁻¹ = h_1 ∈ H and
h′n_1h′⁻¹ = n_2 ∈ N as N is a normal subgroup of G) ∈ HN ⇒ HN is a subgroup
of G.

Define f : H → HN/N by h ↦ hN. Then f is an epimorphism. By using the First
Isomorphism Theorem, H/ker f ≅ f(H) = HN/N.

Now ker f = {x ∈ H : f(x) = N} = {x : x ∈ H and xN = N} = {x : x ∈
H and x ∈ N} = H ∩ N. Consequently, H/(H ∩ N) ≅ HN/N. □

Theorem 2.6.8 (Factor of a Factor Theorem or Third Isomorphism Theorem) If
both H and K are normal subgroups of a group G and K ⊆ H, then H/K is a
normal subgroup of G/K and G/H ≅ (G/K)/(H/K).

Proof Clearly, G/K, G/H and H/K are groups.

Consider the map f : G/K → G/H given by

f(aK) = aH, ∀a ∈ G.

As aK = bK ⇒ a⁻¹b ∈ K ⊆ H ⇒ aH = bH, the map f is well defined.
Again,

f(aK · bK) = f(abK) = abH = aHbH
= f(aK)f(bK), for all aK, bK ∈ G/K

shows that f is a homomorphism.

Clearly, f is an epimorphism. We have

ker f = {aK ∈ G/K : f(aK) = aH = H}
= {aK : a ∈ H} = H/K.

Consequently, by the First Isomorphism Theorem, (G/K)/(H/K) ≅ G/H. □

Theorem 2.6.9 Let f : G → K be an epimorphism of groups and S a subgroup
of K. Then H = f⁻¹(S) is a subgroup of G such that ker f ⊆ H. Moreover, if S is
normal in K, then H is normal in G. Furthermore, if H_1 ⊇ ker f is any subgroup
of G such that f(H_1) = S, then H_1 = H.

Proof It is clear that, as 1 ∈ H, H ≠ ∅. For g, h ∈ H, f(gh⁻¹) = f(g)f(h⁻¹) =
f(g)f(h)⁻¹ ∈ S ⇒ gh⁻¹ ∈ H ⇒ H is a subgroup of G. Now ker f = {g ∈ G :
f(g) = 1_K ∈ S} ⇒ ker f ⊆ H. Next let S be normal in K, g ∈ G and h ∈ H. Then
f(ghg⁻¹) = f(g)f(h)f(g)⁻¹ ∈ S ⇒ ghg⁻¹ ∈ H ⇒ H is normal in G.

Finally, let H_1 ⊇ ker f be a subgroup of G such that f(H_1) = S. We claim that
H_1 = H. Now h_1 ∈ H_1 ⇒ f(h_1) ∈ S ⇒ H_1 ⊆ H. Again h ∈ H ⇒ f(h) = s ∈ S.
Choose h_1 ∈ H_1 such that f(h_1) = s. Then f(hh_1⁻¹) = f(h)f(h_1)⁻¹ = ss⁻¹ =
1_K ⇒ hh_1⁻¹ ∈ ker f ⊆ H_1 ⇒ h ∈ H_1 ⇒ H ⊆ H_1. Consequently, H = H_1. □

Corollary 1 (Correspondence Theorem) If f : G → K is an epimorphism of
groups, then the assignment H ↦ f(H) defines a one-to-one correspondence
between the set S(G) of all subgroups H of G which contain ker f and the set S(K)
of all subgroups of K such that normal subgroups correspond to normal subgroups.

Proof The assignment H ↦ f(H) defines a map f : S(G) → S(K). By
Theorem 2.6.9 it follows that f is a bijection. Moreover, the last part also follows from
this theorem. □

Corollary 2 If N is a normal subgroup of a group G, then every subgroup of G/N
is of the form T/N, where T is a subgroup of G that contains N.

Proof Apply Corollary 1 (Theorem 2.6.9) to the natural epimorphism π : G →
G/N. □

By using the First Isomorphism Theorem, the structure of certain quotient groups 
is determined. 

Example 2.6.2 (i) Let G = GL(n, R) be the group of all non-singular n × n
matrices over R under usual matrix multiplication and H the subset of all matrices
X ∈ GL(n, R) such that det X = 1, i.e., H = {X ∈ GL(n, R) : det X = 1}. Since
the identity matrix I_n ∈ H, H ≠ ∅. Let X, Y ∈ H, so that det X = det Y = 1. By
using the properties of determinants, we have det(XY⁻¹) = det X · det(Y⁻¹) =
det X · (1/det Y) = 1. Thus X, Y ∈ H ⇒ XY⁻¹ ∈ H ⇒ H is a subgroup of G by
Theorem 2.4.1. Moreover, H is a normal subgroup of G by Theorem 2.6.5,
because H is the kernel of the determinant function det : GL(n, R) → R* given by
X ↦ det X, where R* is the multiplicative group of non-zero reals. As the image
set of this homomorphism is R*, we have GL(n, R)/H ≅ R*.

Remark A similar result holds for GL(n, C) with complex entries. 

(ii) Let f : (R, +) → (C*, ·) be the homomorphism defined by f(x) = e^(2πix) =
cos 2πx + i sin 2πx ∀x ∈ R. Then ker f is the additive group (Z, +) and Im f is the
multiplicative group H of all complex numbers with modulus 1. Hence the additive
quotient group R/Z is isomorphic to the multiplicative group H.

Problem If H is a non-normal subgroup of a group G, is it possible to make the
set {aH} of left cosets of H in G a group with the rule of multiplication aH bH =
abH ∀a, b ∈ G?

Solution (The answer is no.) If it were possible, then the mapping f : G → G′, the
group formed by the cosets, given by x ↦ xH, would be a homomorphism with H
as kernel. But this is not possible unless H is normal in G.
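
The obstruction can be seen concretely: for a non-normal subgroup the rule aH bH = abH depends on the chosen representatives. The Python sketch below searches S_3 for such a dependence (0-based tuples for permutations; the function names are illustrative). It is a brute-force check under these assumptions, not the argument of the Solution.

```python
from itertools import permutations

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def coset(a, H):
    """The left coset aH as a frozenset."""
    return frozenset(compose(a, h) for h in H)

def product_is_well_defined(H, G):
    """True iff aH = xH and bH = yH always force (ab)H = (xy)H."""
    for a in G:
        for x in G:
            if coset(a, H) != coset(x, H):
                continue
            for b in G:
                for y in G:
                    if coset(b, H) == coset(y, H) and \
                       coset(compose(a, b), H) != coset(compose(x, y), H):
                        return False
    return True

S3 = list(permutations(range(3)))
normal_H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # {e, (123), (132)}
non_normal_K = [(0, 1, 2), (1, 0, 2)]          # {e, (12)}

print(product_is_well_defined(normal_H, S3),
      product_is_well_defined(non_normal_K, S3))
```

The rule is well defined for the normal subgroup and fails for K = {e, (12)}.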

We now study a very important subgroup, defined below, which is closely
associated with a given subgroup.

Definition 2.6.4 If H is a subgroup of a group G, then the set N(H) = {a ∈ G :
aHa⁻¹ = H} = {a ∈ G : aH = Ha} is called the normalizer of H and H is said to
be invariant under each a ∈ N(H).

Proposition 2.6.2 If N(H) is the normalizer of a subgroup H of a group G, then 

(i) N(H) is a subgroup of G; 

(ii) H is a normal subgroup of N(H);

(iii) if H and K are subgroups of G, and H is a normal subgroup of K, then
K ⊆ N(H);

(iv) H is a normal subgroup of G ⇔ N(H) = G.

Proof (i) Since 1 ∈ N(H), N(H) ≠ ∅. Now x ∈ N(H) ⇒ xH = Hx ⇒ x⁻¹(xH) =
x⁻¹(Hx) ⇒ H = (x⁻¹H)x ⇒ Hx⁻¹ = x⁻¹H.

Then a, x ∈ N(H) ⇒ aH = Ha and x⁻¹H = Hx⁻¹ ⇒ (ax⁻¹)H = a(x⁻¹H) =
a(Hx⁻¹) = (aH)x⁻¹ = H(ax⁻¹) ⇒ ax⁻¹ ∈ N(H) ⇒ N(H) is a subgroup of G.

(ii) and (iii) follow trivially.

(iv) H is a normal subgroup of G ⇒ for each g ∈ G, gH = Hg ⇒ G ⊆ N(H).
Consequently, G = N(H). The converse part follows from (ii). □
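
For a finite group the normalizer can be computed directly from its definition. The following Python sketch does this in S_3 for the two subgroups used in the earlier examples (0-based tuples for permutations; the function name normalizer is illustrative).

```python
from itertools import permutations

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def normalizer(H, G):
    """N(H) = {a in G : aH = Ha}, computed by comparing the two cosets."""
    return {a for a in G
            if {compose(a, h) for h in H} == {compose(h, a) for h in H}}

S3 = list(permutations(range(3)))
K = {(0, 1, 2), (1, 0, 2)}               # {e, (12)}, not normal in S_3
H = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # {e, (123), (132)}, normal in S_3

print(normalizer(K, S3) == K, normalizer(H, S3) == set(S3))
```

For the non-normal subgroup K the normalizer is K itself, while for the normal subgroup H it is all of S_3, in line with part (iv) of the proposition.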
In abelian groups every subgroup is normal, but this is not true in an arbitrary
non-abelian group. Some groups have no normal subgroups other than the
trivial ones. Such groups are called simple groups.

Definition 2.6.5 A group G is said to be simple iff G and {1} are its only normal 
subgroups. 

Definition 2.6.6 A normal subgroup H of a group G is said to be a maximal normal
subgroup iff H ≠ G and there is no normal subgroup N of G such that H ⊊ N ⊊ G.

Theorem 2.6.10 If H is a normal subgroup of a group G, then G/H is simple iff 
H is a maximal normal subgroup of G. 

Proof Let H be a maximal normal subgroup of G. Consider the canonical
homomorphism f : G → G/H. As f is an epimorphism, the inverse image under f of any
proper non-trivial normal subgroup of G/H would be a proper normal subgroup of G
properly containing H, which contradicts the maximality of H. Consequently G/H
must be simple. Conversely, let G/H be simple. If N is a normal subgroup of G
properly containing H, then f(N) is a normal subgroup of G/H and if also N ≠ G,
then f(N) ≠ G/H and f(N) ≠ {H}, which is not possible. So no such N exists and
hence H is maximal. □

Remark There was a long-standing conjecture that a non-abelian simple group of
finite order has an even number of elements. This conjecture has been proved to be
true by Walter Feit and John Thompson.


2.7 Geometrical Applications 

2.7.1 Symmetry Groups of Geometric Figures in Euclidean Plane 


The study of symmetry gives the most appealing applications of group theory.
While studying symmetry we use geometric reasoning. Symmetry is a common
phenomenon in science. In general a symmetry of a geometrical figure is a one-one
transformation of its points which preserves distance. Any symmetry of a polygon
of n sides in the Euclidean plane is uniquely determined by its effect on its
vertices, say {1, 2, ..., n}. The group of symmetries of a regular polygon of n sides
is called the dihedral group of degree n, denoted by D_n, which is a subgroup of S_n
and contains 2n elements. D_n is generated by the rotation r of the regular polygon of
n sides through an angle 2π/n radians in its own plane about the origin in the
anticlockwise direction and certain reflections s satisfying some relations (see Ex. 19 of
Exercises-III).

(a) Isosceles triangle. Figure 2.1 is symmetric about the perpendicular bisector
1D. So the symmetry group consists of the identity and the reflection about the line 1D. In
terms of permutations of vertices this group is

\[
\left\{ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix},\
\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} \right\}.
\]

Fig. 2.2 Group of symmetries of equilateral triangle (S_3)


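(b) Equilateral triangle. In Fig. 2.2, the symmetry group consists of three rotations
of magnitudes 2π/3, 4π/3, 2π about the center G denoted by r_1, r_2 and r_3,
respectively, together with three reflections about the perpendicular lines 1D, 2E, 3F
denoted by t_1, t_2, and t_3, respectively. So in terms of permutations of vertices the
symmetry group is

\[
r_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix},\quad
r_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},\quad
r_3 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix},
\]
\[
t_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix},\quad
t_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix},\quad
t_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix},
\]

which is the whole symmetric group S_3.
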

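(c) Rectangle (not a square). In this case, the symmetry group consists of the
identity I, reflections r_1, r_2 about DE, FG (lines joining the midpoints of opposite
sides), respectively, and reflection r_3 about O (which is actually a rotation about O
through an angle π). In terms of permutations of vertices this group is

\[
I = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix},\quad
r_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix},\quad
r_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix},\quad
r_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix},
\]

which is Klein’s 4-group (see Ex. 22 of Exercises-III).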

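(d) Square. In Fig. 2.3, the symmetry group consists of four rotations r_1, r_2, r_3, r_4
(= Identity I) of magnitudes π/2, π, 3π/2, 2π about the circumcenter O, respectively,
and four reflections t_1, t_2, t_3, t_4 about DE, FG, 13, 24.

Fig. 2.3 Group of symmetries of the square (D_4)

In terms of permutations of vertices involved here, this group is

\[
r_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix},\quad
r_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix},\quad
r_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix},\quad
r_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix},
\]
\[
t_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix},\quad
t_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix},\quad
t_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix},\quad
t_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix}.
\]
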

This group is called the group of symmetries of the square (also called the octic
group) and is the dihedral group D_4 (see Ex. 19 of Exercises-III).


Remark In this group D_4 there are eight (hence octic) mappings of a square into
itself that preserve the distance between points, map adjacent vertices into adjacent
vertices, and map the center into the center. The only such distance
preserving mappings are rotations of the square about its center, reflections (flips)
about various bisectors, and the identity map.
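
That one rotation and one reflection already generate all eight symmetries can be checked by closing them up under composition. The Python sketch below is an illustration under stated assumptions: the vertices are labelled 0, 1, 2, 3 in cyclic order (so the labelling need not match Fig. 2.3), r is a quarter turn, t is the reflection fixing vertices 0 and 2, and the helper names are not from the text.

```python
def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Close a set of permutations under right multiplication by the generators."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

e = (0, 1, 2, 3)
r = (1, 2, 3, 0)      # quarter turn: vertex i goes to vertex i + 1 (mod 4)
t = (0, 3, 2, 1)      # reflection in the diagonal through vertices 0 and 2

D4 = generate([r, t])
r_inverse = compose(compose(r, r), r)            # r^3 = r^(-1)

print(len(D4))
print(compose(r, r_inverse) == e, compose(t, t) == e)
print(compose(compose(t, r), t) == r_inverse)    # the relation t r t = r^(-1)
```

The closure has 2n = 8 elements for n = 4, and the generators satisfy r^4 = e, t^2 = e and trt = r⁻¹.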


2.7.2 Group of Rotations of the Sphere

A sphere with a fixed center O can be brought from a given position into any other
position by rotating the sphere about an axis through O. Clearly, two rotations about
the same axis have the same result iff their angles differ by a multiple of 2π. Thus if r(S)
denotes the set of all rotations about the same axis, then we regard two rotations r and r′
in r(S) as equal iff their angles differ by a multiple of 2π. Clearly, the result
of two successive rotations in r(S) can also be obtained by a single rotation in r(S).
It follows that r(S) forms a group. (The identity is the rotation in r(S) through an
angle 0 and the inverse of a rotation r in r(S) has the same angle but in the opposite
direction of r.) Thus if any rotation r in r(S) has the angle of rotation θ about the
axis, then the map f : r(S) → S^1 defined by f(r) = e^(iθ) is a group isomorphism
from the group r(S) onto the circle group S^1 (see Example 2.3.2).

Remark The rotations of R^2 or R^3 about the origin are the linear operators whose
matrices with respect to the natural basis are orthogonal and have determinant 1 (see
Chap. 8).
Fig. 2.4 Structure of CH_4

2.7.3 Clock Arithmetic 

On a 24-hour clock, let the cyclic group ⟨a⟩ of order 24 represent hours. Then
each of the 24 numerals of the dial serves as a representative of a coset of hours. Here
we use the fact that a 10-hour journey starting at 20 hrs (8 o’clock in the evening) ends
at 30 hrs, i.e., 20 + 10 = 30 ≡ 6 o’clock (the following day). In shifting from a
24-hour clock to a 12-hour clock, we take the 24 hours modulo the normal subgroup
H = {1, a^12}. Then the quotient group ⟨a⟩/H is a group of 12 cosets, or a group of
representatives {a, a^2, a^3, ..., a^11, a^12}, where (a^k) denotes the coset of a^k:
(a) = {a, a^13}, (a^2) = {a^2, a^14}, (a^3) = {a^3, a^15}, ..., (a^11) = {a^11, a^23} and
(a^12) = {a^12, a^24}. A seven-hour journey starting
at 9 AM (9 PM) ends at 4 PM (4 AM). On the 12-hour clock we do not distinguish
between the congruent elements in the same coset {4 AM, 4 PM}.
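
The passage from the 24-hour clock to the 12-hour clock is a quotient construction that is easy to simulate. The Python sketch below writes the cyclic group additively as Z_24, so that H = {1, a^12} becomes {0, 12}; this additive rewriting and the function names are assumptions made for illustration.

```python
HOURS = 24
H = {0, 12}                       # additive copy of the subgroup {1, a^12}

def coset(h):
    """The coset h + H of the hour h, as a frozenset of hours mod 24."""
    return frozenset((h + k) % HOURS for k in H)

cosets = {coset(h) for h in range(HOURS)}

def add(c1, c2):
    """Add two cosets through (arbitrary) representatives."""
    return coset(min(c1) + min(c2))

print(len(cosets))                       # number of positions on the 12-hour dial
print(sorted(add(coset(9), coset(7))))   # 7 hours after 9 AM
print(sorted(add(coset(21), coset(7))))  # 7 hours after 9 PM
```

Both journeys end in the same coset {4, 16}, that is, at 4 o'clock on the 12-hour dial.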

A Note on Symmetry Group Group theory is an ideal tool to study symmetries.
In this note we call any subset of R^2 or R^3 an object. We study orthogonal maps
from R^2 to R^2 or from R^3 to R^3 and these are rotations or reflections or
rotation-reflections. As orthogonal maps are only those linear maps which keep lengths and
angles invariant, they do not include translations.

Let X be an object in R^2 or R^3. Then the set S(X) of all orthogonal maps g with
g(X) = X is a group with respect to the usual composition of maps. This group
is called the symmetry group of X. The elements of S(X) are called symmetry
operations on X. Let S_2(X) (or S_3(X)) denote the symmetry group of X in
R^2 (or R^3).

Examples (a) If X = regular n-gon in R^2 with center at (0, 0), then S_2(X) = D_n,
the dihedral group with 2n elements and for S_3(X), we also get the reflection about
the xy-plane and its compositions with all g ∈ S_2(X), i.e., S_3(X) = D_n × Z_2.

(b) If X = regular tetrahedron with center at (0, 0, 0), then S_3(X) ≅ S_4.

(c) If X = cube, then S_3(X) ≅ S_4 × Z_2.

(d) If X = the letter M, then S_2(X) ≅ Z_2 and S_3(X) ≅ Z_2 × Z_2.

(e) If X = a circle, then S_2(X) = O_2(R), the whole orthogonal group of all
orthogonal maps from R^2 to R^2.

For applications to chemistry and crystallography, X is considered a molecule
and S(X) depends on the structural configuration of the molecule X.

For example, methane CH_4 (see Fig. 2.4) has the shape of a regular
tetrahedron with H-atoms at the vertices and the C-atom at its center (0, 0, 0). Then by (b),
S_3(CH_4) ≅ S_4.
Weyl (1952) first realized the importance of group theory for the study of symmetry in
nature, and he gave a strong impetus to its applications. Molecules,
crystals, and elementary particles are different, but they have very similar theories of
symmetry.

Applications of groups to physics are described in Elliot and Dawber (1979) and 
Sternberg (1994) and those to Chemistry in Cotton (1971) and Farmer (1996). 


2.8 Free Abelian Groups and Structure Theorem 

If we look at the dihedral group D_n (see Ex. 19 of SE-III) we see that D_n has two
generators r and a satisfying some relations other than the associative property. But
we now consider groups which have a set of generators satisfying no relations other
than associativity, which is implied by the group axioms. Such groups are called free
groups. In this section we study free abelian groups and prove the ‘Fundamental
Theorem of Finitely Generated Abelian Groups’, which is a structure theorem. This
theorem gives the notions of ‘Betti numbers’ and ‘invariant factors’. We also introduce
the concept of ‘homology and cohomology groups’, which are very important in
homological algebra and algebraic topology.

A free abelian group is a direct sum of copies of the additive abelian group of
integers Z. It has some properties similar to those of vector spaces. Every free abelian group has
a basis and its rank is defined as the cardinality of a basis. The rank determines the
group up to isomorphism and the elements of such a group can be written as finite
formal sums of the basis elements.

The concept of free abelian groups is very important in mathematics. It has wide 
applications in homology theory. Algebraic topology is also used to prove some 
interesting properties of free abelian groups [see Rotman (1988)]. 

We consider in this section an additive abelian group G. 

Definition 2.8.1 Let G be an additive abelian group and $\{G_i\}_{i \in I}$ be a family of subgroups of G. Then G is said to be generated by $\{G_i\}$ iff every element $x \in G$ can be expressed as

$x = x_{i_1} + \cdots + x_{i_n}$, where $x_{i_t} \in G_{i_t}$ and the indices $i_t$ are distinct.

We sometimes use the following notation:

$x = \sum_{i \in I} x_i$, where we take $x_i = 0$, if $i$ is not one of the indices $i_1, i_2, \ldots, i_n$.

If the subgroups $\{G_i\}$ generate G, we write

$G = \sum_{i \in I} G_i$, in general, and $G = \sum_{i=1}^{n} G_i$, when $I = \{1, 2, \ldots, n\}$.

Fig. 2.5 Commutativity of the triangle (diagram with the maps $k_i : G_i \to G$, $h_i : G_i \to A$ and $h : G \to A$)

Definition 2.8.2 If a group G is generated by the groups $\{G_i\}_{i \in I}$ and every element $x \in G$ has a unique representation as $x = \sum_{i \in I} x_i$, where $x_i \in G_i$ and $x_i = 0$ for all but finitely many $i$, then G is said to be the direct sum of the groups $G_i$, written

$G = \bigoplus_{i \in I} G_i$, in general, and $G = \bigoplus_{i=1}^{n} G_i$, if $I = \{1, 2, \ldots, n\}$.

Remark For n = 2, see Ex. 21 of Exercises-III. 

A Characterization of Direct Sums Let G be an abelian group and $\{G_i\}$ be a family of subgroups of G. We say that G satisfies the extension condition (EC) for direct sums iff given any abelian group A and any family of homomorphisms $h_i : G_i \to A$, there exists a unique homomorphism $h : G \to A$ extending $h_i$ for each $i$, i.e., $h|_{G_i} = h_i$, i.e., making the diagram in Fig. 2.5 commutative, where $k_i : G_i \hookrightarrow G$ is the inclusion map for each $i$.

Proposition 2.8.1 Let G be an abelian group and $\{G_i\}$ be a family of subgroups of G.

(a) If $G = \bigoplus G_i$, then G satisfies the condition (EC);

(b) if G is generated by $\{G_i\}$ and G satisfies the condition (EC), then $G = \bigoplus G_i$.

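Proof (a) Let $G = \bigoplus G_i$. Then for the given homomorphisms $h_i : G_i \to A$, we define a homomorphism $h : G \to A$ as follows.

If $x = \sum x_i$ (unique finite sum), then the map h given by $h(x) = \sum h_i(x_i)$ is well defined and is our required homomorphism; it is unique, since any homomorphism extending every $h_i$ must take this value on x.

(b) Let $x = \sum x_i = \sum y_i$ be two representations of $x \in G$. Given an index $j$, let $A = G_j$.

Define $h_i : G_i \to G_j$ as follows:

$h_i = \mathrm{id}$, if $i = j$;

$h_i = 0$, if $i \neq j$.

Let $h : G \to G_j$ be the homomorphism extending each $h_i$ by the condition (EC). Then

$h(x) = \sum_{i \in I} h_i(x_i) = x_j$ and $h(x) = \sum_{i \in I} h_i(y_i) = y_j$.

Hence $x_j = y_j$ for every $j$ $\Rightarrow$ x has a unique representation $\Rightarrow$ $G = \bigoplus G_i$. □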

Corollary 1 Let $G = H \oplus K$, where H and K are subgroups of G such that $H = \bigoplus_{i \in I} G_i$ and $K = \bigoplus_{j \in J} G_j$ and $I \cap J = \emptyset$. Then $G = \bigoplus_{r \in I \cup J} G_r$.

Proof Let A be an abelian group. If $h_i : G_i \to A$ and $h_j : G_j \to A$ are families of homomorphisms, then by Proposition 2.8.1, they can be extended to homomorphisms $f : H \to A$ and $g : K \to A$, respectively. Again f and g can be extended to a homomorphism $h : G \to A$ by Proposition 2.8.1. Hence $G = \bigoplus_{r \in I \cup J} G_r$. □

Corollary 2 $(G_1 \oplus G_2) \oplus G_3 = G_1 \oplus (G_2 \oplus G_3) = G_1 \oplus G_2 \oplus G_3$ for any subgroups $G_1, G_2, G_3$ of a group G.

Corollary 3 For any group G and subgroups $G_1$ and $G_2$, $G = G_1 \oplus G_2 \Rightarrow G/G_2 \cong G_1$.


Proof Let $A = G_1$ and $h_1 : G_1 \to A = G_1$ be the identity homomorphism and $h_2 : G_2 \to A$ be the zero homomorphism. If $h : G \to A$ is the homomorphism extending $h_1$ and $h_2$, then h is an epimorphism with $\ker h = G_2$. Hence $G/G_2 \cong G_1$. □

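We are now in a position to study free abelian groups. Let G be an additive group. Then $(m + n)x = mx + nx$, $\forall x \in G$ and $\forall m, n \in \mathbf{Z}$.

If the group G is abelian, then $n(x + y) = nx + ny$, $\forall x, y \in G$ and $\forall n \in \mathbf{Z}$.

If $S\ (\neq \emptyset)$ is a subset of G, then the subgroup $\langle S \rangle$ generated by S is given by

$\langle S \rangle = \{n_1 s_1 + n_2 s_2 + \cdots + n_t s_t : n_i \in \mathbf{Z},\ s_i \in S\}$.

In particular, if $S = \{s\}$, then $\langle s \rangle = \{ns : n \in \mathbf{Z}\}$ is the cyclic group generated by s and if $S = \emptyset$, then $\langle S \rangle = \{0\}$. The subset S is said to be independent iff $\sum n_i s_i = 0$ (with distinct $s_i \in S$) $\Rightarrow n_i = 0$, $\forall i$.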


Definition 2.8.3 Let G be an additive abelian group. The group G is said to be a free abelian group with a basis B iff

(i) for each $b \in B$, the cyclic subgroup $\langle b \rangle$ is infinite cyclic; and

(ii) $G = \bigoplus_{b \in B} \langle b \rangle$ (direct sum).

Remark A free abelian group G is a direct sum of copies of $\mathbf{Z}$ and every element $x \in G$ can be expressed uniquely as $x = \sum n_b b$, where $n_b \in \mathbf{Z}$ and almost all (i.e., all but a finite number of) $n_b$ are zero.

Using the extension condition (EC) for direct sums, the following characteriza- 
tion of free abelian groups follows. 

Proposition 2.8.2 Let G be an abelian group and $\{b_i\}$ be a family of elements of G that generates G. Then G is a free abelian group with a basis $\{b_i\}$ iff for any abelian group A and any family $\{a_i\}$ of elements of A, there is a unique homomorphism $h : G \to A$ such that $h(b_i) = a_i$ for each $i$.

Proof Let $G_i = \langle b_i \rangle$. Then $\{G_i\}$ is a family of subgroups of G. First suppose that the stated extension condition holds. We claim that each $G_i$ is infinite cyclic. If for some index $j$, the element $b_j$ generates a finite cyclic subgroup of G, then taking $A = \mathbf{Z}$, there exists no homomorphism $h : G \to A$ mapping each $b_i$ to the number 1. This is so because $b_j$ is of finite order but 1 is not of finite order in $\mathbf{Z}$. Moreover, a family of homomorphisms $h_i : G_i \to A$ is determined by the elements $a_i = h_i(b_i)$, so G satisfies the condition (EC). Hence by Proposition 2.8.1(b), $G = \bigoplus G_i$.

Conversely, let G be a free abelian group with a basis $\{b_i\}$. Then given elements $\{a_i\}$ of A, $\exists$ homomorphisms $h_i : G_i \to A$ such that $h_i(b_i) = a_i$, because $G_i$ is infinite cyclic. Hence the proof is completed by using Proposition 2.8.1(a). □
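
The extension property of Proposition 2.8.2 can be illustrated with a minimal Python sketch. The representation of elements as dictionaries, the function names `add` and `extend_to_hom`, the basis labels and the target group $\mathbf{Z}_6$ are all assumptions made for illustration only; they are not notation from the text.

```python
def add(x, y):
    """Add two elements of a free abelian group.

    An element is a dict {basis label: non-zero integer coefficient}
    with finitely many keys, i.e. a finite formal sum of basis elements.
    """
    z = dict(x)
    for b, n in y.items():
        z[b] = z.get(b, 0) + n
    return {b: n for b, n in z.items() if n != 0}


def extend_to_hom(images, modulus):
    """Extend b -> images[b] to h : F -> Z_modulus via h(sum n_b b) = sum n_b images[b]."""
    def h(x):
        return sum(n * images[b] for b, n in x.items()) % modulus
    return h


# Hypothetical data: basis {b1, b2, b3}, target group A = Z_6.
h = extend_to_hom({"b1": 1, "b2": 4, "b3": 5}, modulus=6)
x = {"b1": 2, "b2": -3}
y = {"b2": 3, "b3": 1}
print(h(add(x, y)) == (h(x) + h(y)) % 6)  # additivity check on one pair
```

The map is forced by the values on the basis, which is the uniqueness part of the proposition; the sketch only checks additivity on sample elements and is not a proof.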

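Theorem 2.8.1 If G is a free abelian group with a basis $B = \{b_1, b_2, \ldots, b_n\}$, then n is uniquely determined by G.

Proof By hypothesis, $G \cong \mathbf{Z} \oplus \mathbf{Z} \oplus \cdots \oplus \mathbf{Z}$ (n summands). Then 2G is a subgroup of G such that

$2G \cong 2\mathbf{Z} \oplus 2\mathbf{Z} \oplus \cdots \oplus 2\mathbf{Z}$ (n summands).

Hence $G/2G \cong \mathbf{Z}/2\mathbf{Z} \oplus \mathbf{Z}/2\mathbf{Z} \oplus \cdots \oplus \mathbf{Z}/2\mathbf{Z}$ (n summands) shows that card $(G/2G) = 2^n$. □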

Definition 2.8.4 An abelian group G is said to be a finitely generated free abelian 
group iff G has a finite basis. 

Remark The basis of a finitely generated free abelian group is not unique. For example, {(1, 0), (0, 1)} and {(-1, 0), (0, -1)} are two different bases of $G = \mathbf{Z} \oplus \mathbf{Z}$.
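
As a hedged illustration of the remark, the sketch below tests whether n integer vectors form a basis of $\mathbf{Z}^n$ by using the standard fact (not proved in this section) that this happens exactly when the matrix of the vectors has determinant $\pm 1$. The function names `det` and `is_basis_of_Zn` are chosen here for illustration.

```python
def det(m):
    """Integer determinant by cofactor expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def is_basis_of_Zn(vectors):
    """True iff the n given integer vectors of length n have determinant +1 or -1."""
    n = len(vectors)
    if n == 0 or any(len(v) != n for v in vectors):
        return False
    return det([list(v) for v in vectors]) in (1, -1)


print(is_basis_of_Zn([(1, 0), (0, 1)]))    # the standard basis
print(is_basis_of_Zn([(-1, 0), (0, -1)]))  # the second basis from the remark
print(is_basis_of_Zn([(2, 0), (0, 1)]))    # independent, but generates only 2Z + Z
```

The last family is independent yet is not a basis, which shows that independence alone does not suffice in a free abelian group.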

Corollary 1 Let G be a finitely generated free abelian group. Then any two bases of G have the same cardinal number.

Proof Let $B = \{b_1, b_2, \ldots, b_n\}$ and $X = \{x_1, x_2, \ldots, x_r\}$ be two bases of G. Then by Theorem 2.8.1, card $(G/2G) = 2^n$ and also card $(G/2G) = 2^r$. Hence $n = r$. □

For a more general result, as for vector spaces, we prove the following theorem.

Theorem 2.8.2 Any two bases of a free abelian group F have the same cardinality. 

Proof Let B and C be two bases of F. Given a prime integer p, the quotient group $F/pF$ is a vector space over the field $\mathbf{Z}_p$ (see Chap. 8 and Example 4.1.5 of Chap. 4). Hence the cosets $\{b + pF : b \in B\}$ form a basis $\Rightarrow$ $\dim_{\mathbf{Z}_p}(F/pF) = $ card B. Similarly, $\dim_{\mathbf{Z}_p}(F/pF) = $ card C. Hence card B = card C. □

Remark If V = F/pF is infinite dimensional, Zorn’s Lemma is used to prove the 
existence of a basis of V and then using the result that the family of all finite subsets 
of an infinite set B has the same cardinality as B the invariance of the cardinality of 
a basis is proved. 

Definition 2.8.5 Let F be a free abelian group with a basis B . The cardinality of B 
is called the rank of F, denoted rank F. In particular, if F is finitely generated, then 
the number of elements in a basis of F is the rank F. 

Remark Rank F is well defined by Theorem 2.8.2.

Like vector spaces, free abelian groups are characterized by their ranks. More
precisely, two free abelian groups are isomorphic iff they have the same rank. Let G 
be an arbitrary abelian group. We define its rank as follows: 

Definition 2.8.6 An abelian group G has rank r (possibly infinite) iff $\exists$ a free abelian subgroup F of G such that

(a) rank F = r;

(b) G/F is a torsion group (i.e., every element of G/F is of finite order).

Existence of F: Let B be an independent subset of G. Then the subgroup $\langle B \rangle$ generated by B is abelian and free with a basis B. Let S be a maximal independent subset of G, which exists by Zorn’s Lemma. Then $F = \langle S \rangle$ is a free abelian group and G/F is a torsion group.

Remark Rank F depends only on G, since rank $G = \dim_{\mathbf{Q}}(\mathbf{Q} \otimes G)$, where $\mathbf{Q} \otimes G$ is a vector space over $\mathbf{Q}$ [see Chaps. 8 and 9]. Hence rank G is well defined.

Sometimes, given an arbitrary family of abelian groups $\{G_i\}$, we can find a group G that contains subgroups $H_i$ isomorphic to the groups $G_i$, such that G is the direct sum of these subgroups. This leads to the concept of the external direct sum of groups.

Definition 2.8.7 Let $\{G_i\}$ be a family of abelian groups. If G is an abelian group and $f_i : G_i \to G$ is a family of monomorphisms such that G is the direct sum of the groups $f_i(G_i)$, then we say that G is the external direct sum of the groups $G_i$, relative to the monomorphisms $f_i : G_i \to G$.

We prescribe a construction of G.

Theorem 2.8.3 Given a family of abelian groups $\{G_i\}_{i \in I}$, there exists an abelian group G and a family of monomorphisms $f_i : G_i \to G$ such that $G = \bigoplus f_i(G_i)$.

Proof We consider the cartesian product $\prod_{i \in I} G_i$. It is an abelian group under componentwise addition. Let G be the subgroup of the cartesian product consisting of those tuples $(x_i)_{i \in I}$ such that $x_i = 0_i$ (the zero element of $G_i$) for all but finitely many values of $i$. Given an index $j$, define $f_j : G_j \to G$ by letting $f_j(x)$ be the tuple that has x at its jth coordinate and 0 at its ith coordinate for $i \neq j$. Then $f_j$ is a monomorphism. Since each element $x \in G$ has only finitely many non-zero coordinates, x can be expressed uniquely as a finite sum of elements from the groups $f_j(G_j)$. □

A Characterization of External Direct Sums of Groups Like the extension condition (EC) for direct sums, an extension condition (ED) for external direct sums is defined.

Fig. 2.6 Commutativity of the triangle for external direct sum

Let $\{G_i\}_{i \in I}$ be a family of abelian groups, G be an abelian group and $f_i : G_i \to G$ be a family of homomorphisms. Then G is said to satisfy the condition (ED) iff given any abelian group A and a family of homomorphisms $h_i : G_i \to A$, $\exists$ a unique homomorphism $h : G \to A$ making the diagram in Fig. 2.6 commutative, i.e., $h \circ f_i = h_i$, $\forall i$.

Proposition 2.8.3 Let $\{G_i\}_{i \in I}$ be a family of abelian groups, G be an abelian group and $f_i : G_i \to G$ be a family of homomorphisms.

(a) If each $f_i : G_i \to G$ is a monomorphism and $G = \bigoplus_{i \in I} f_i(G_i)$, then G satisfies the condition (ED).

(b) Conversely, if $\{f_i(G_i)\}$ generates G and G satisfies the condition (ED), then each $f_i : G_i \to G$ is a monomorphism and $G = \bigoplus_{i \in I} f_i(G_i)$.

Proof Left as an exercise. □ 

We now prove the following structure theorem for finitely generated abelian 
groups and also prove as its corollary the structure theorem of an arbitrary finite 
abelian group (also known as fundamental theorem of finite abelian groups). 

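Theorem 2.8.4 (Fundamental theorem for finitely generated abelian groups) Every finitely generated abelian group G can be expressed uniquely as

$G \cong \underbrace{\mathbf{Z} \oplus \mathbf{Z} \oplus \cdots \oplus \mathbf{Z}}_{r\ \text{summands}} \oplus\, \mathbf{Z}_{n_1} \oplus \mathbf{Z}_{n_2} \oplus \cdots \oplus \mathbf{Z}_{n_t}$

for some integers $r, n_1, n_2, \ldots, n_t$ such that

(i) $r \geq 0$ and $n_j \geq 2$, $\forall j$; and

(ii) $n_i \mid n_{i+1}$, for $1 \leq i \leq t - 1$.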

Proof As finitely generated Z-modules are finitely generated abelian groups, the 
theorem follows from Theorem 9.6.5 of Chap. 9 by taking R = Z. □ 

Definition 2.8.8 The integer r in Theorem 2.8.4 is called the free rank or Betti number of the group G and the integers $n_1, n_2, \ldots, n_t$ are called the invariant factors of G.

Remark Betti numbers are very important in mathematics. For example, two closed 
surfaces are homeomorphic iff their homology groups have the same Betti numbers 
in all dimensions [see Massey (1991)]. 

Theorem 2.8.5 Two finitely generated abelian groups are isomorphic iff they have the same rank and the same invariant factors (up to units).

Proof It follows from Theorem 9.6.11 of Chap. 9 by taking $R = \mathbf{Z}$. □

Remark If we write $F(G) = \mathbf{Z} \oplus \cdots \oplus \mathbf{Z}$ (r summands) and $T(G) = \mathbf{Z}_{n_1} \oplus \cdots \oplus \mathbf{Z}_{n_t}$ in Theorem 2.8.4, then $G \cong F(G) \oplus T(G)$. F(G) is called a free subgroup of G and T(G) is called a torsion subgroup of G.

Corollary 1 (Structure Theorem for finite abelian groups) Any finite abelian group G can be expressed uniquely as $G \cong \mathbf{Z}_{n_1} \oplus \cdots \oplus \mathbf{Z}_{n_t}$ such that $n_i \mid n_{i+1}$, for $1 \leq i \leq t - 1$.


Proof The finite abelian group G is certainly finitely generated (which is generated 
by the finite set consisting of all its elements). Hence, in this case, F(G) = 0. □ 

Corollary 2 Two finite abelian groups are isomorphic iff they have the same in- 
variant factors. 
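
To make the invariant factors concrete, here is a minimal Python sketch that computes them for a finite abelian group presented as $\mathbf{Z}_{m_1} \oplus \cdots \oplus \mathbf{Z}_{m_k}$. It is not the method of Chap. 9; it assumes the standard isomorphism $\mathbf{Z}_a \oplus \mathbf{Z}_b \cong \mathbf{Z}_{\gcd(a,b)} \oplus \mathbf{Z}_{\mathrm{lcm}(a,b)}$, and the function name `invariant_factors` is hypothetical.

```python
from math import gcd


def invariant_factors(orders):
    """Invariant factors n_1 | n_2 | ... | n_t of Z_{m_1} + ... + Z_{m_k}.

    Each step replaces a pair (a, b) by (gcd(a, b), lcm(a, b)), which does not
    change the isomorphism class; after all pairs i < j have been processed,
    every entry divides all later entries. Trivial factors Z_1 are dropped.
    """
    m = list(orders)
    for i in range(len(m)):
        for j in range(i + 1, len(m)):
            g = gcd(m[i], m[j])
            m[i], m[j] = g, m[i] * m[j] // g
    return [k for k in m if k > 1]


print(invariant_factors([4, 6, 10]))
print(invariant_factors([2, 3]) == invariant_factors([6]))
```

By Corollary 2, two such presentations describe isomorphic groups exactly when the returned lists agree; for instance $\mathbf{Z}_2 \oplus \mathbf{Z}_3$ and $\mathbf{Z}_6$ give the same list.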

Supplementary Examples (SE-IB) 

1 For any abelian group G, the following statements are equivalent: 

(a) G has a finite basis; 

(b) G is the internal direct sum of a family of infinite cyclic groups; 

(c) G is isomorphic to a finite direct sum of finite copies of Z. 

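[Hint. (a) $\Rightarrow$ (b). Let $B = \{b_1, b_2, \ldots, b_t\}$ be a basis of G. Let $n b_i = 0$ for some $n \in \mathbf{Z}$ $\Rightarrow$ $0 b_1 + \cdots + n b_i + \cdots + 0 b_t = 0$ $\Rightarrow$ $n = 0$ $\Rightarrow$ $b_i$ is of infinite order $\Rightarrow$ $\langle b_i \rangle$ is an infinite cyclic group for $1 \leq i \leq t$ $\Rightarrow$ $G = \bigoplus_{i=1}^{t} \langle b_i \rangle$.

(b) $\Rightarrow$ (c). Let $G = \bigoplus_{i=1}^{t} G_i$, where each $G_i \cong \mathbf{Z}$. This implies (c).

(c) $\Rightarrow$ (a). Let $\mathbf{Z} \oplus \mathbf{Z} \oplus \cdots \oplus \mathbf{Z} = \mathbf{Z}^t$ (say) be a finite direct sum of t copies of $\mathbf{Z}$ and $f : G \to \mathbf{Z}^t$ be an isomorphism.

Let $e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \in \mathbf{Z}^t$, where 1 is at the ith place.

Then $\exists\, b_i \in G$, such that $f(b_i) = e_i$ for each $i$.

Clearly, $B = \{b_1, b_2, \ldots, b_t\}$ forms a basis of G.]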

2 Let F be a free abelian group with a basis B and G be an abelian group. 

(a) For every map $f : B \to G$, $\exists$ a unique group homomorphism $\tilde{f} : F \to G$ such that $\tilde{f}(b) = f(b)$, $\forall b \in B$.

(b) For every abelian group G, $\exists$ a free abelian group F such that G is isomorphic to a quotient group of F.

[Hint. (a) $x \in F \Rightarrow \exists\, n_b \in \mathbf{Z}$ such that $x = \sum n_b b$.

Define $\tilde{f} : F \to G$, by $\tilde{f}(x) = \sum n_b f(b)$. Unique expression of x $\Rightarrow$ $\tilde{f}$ is well defined.

Clearly, $\tilde{f}$ is a homomorphism such that $\tilde{f}$ is unique, because homomorphisms agreeing on a basis are equal (like linear transformations of vector spaces).

(b) For each $g \in G$, take an infinite cyclic group generated by $b_g$ (say). Then $F = \bigoplus_{g \in G} \langle b_g \rangle$ is a free abelian group with a basis $B = \{b_g : g \in G\}$. Define a map $f : B \to G$, $b_g \mapsto g$ and extend it by linearity.

Then by (a), $\exists$ a unique homomorphism $\tilde{f} : F \to G$ such that $\tilde{f}(b_g) = f(b_g) = g$. Hence $\tilde{f}$ is an epimorphism $\Rightarrow$ $G \cong F/\ker \tilde{f}$.]

3 Every finitely generated abelian group G is the homomorphic image of a finitely 
generated free abelian group F. 

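[Hint. Let $G = \langle X \rangle$, where $X = \{x_1, x_2, \ldots, x_t\}$ and $B = \{b_1, b_2, \ldots, b_t\}$ be a basis of F.

Define a map $f : F \to G$, $b_i \mapsto x_i$ and extend it by linearity. Then f is well defined and an epimorphism $\Rightarrow$ $G = f(F)$.]

4 Let $F = \langle x \rangle$ be a free abelian group. Then $F/\langle nx \rangle \cong \mathbf{Z}_n$, for all integers $n > 0$.

[Hint. Define a map $f : F \to \mathbf{Z}_n$, $mx \mapsto [m]$, $\forall m \in \mathbf{Z}$. Then f is an epimorphism with $\ker f = \langle nx \rangle$ $\Rightarrow$ $F/\langle nx \rangle \cong \mathbf{Z}_n$.]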


2.8.1 Exercises 

Exercises-II 

1. For a given non-empty set S, define a word to be a finite product $x_1^{n_1} x_2^{n_2} \cdots x_t^{n_t}$, where $x_i \in S$ and $n_i \in \mathbf{Z}$. The word is said to be reduced iff $x_i \neq x_{i+1}$ and the $n_i$'s are non-zero. We can make a given word reduced by collecting up powers of the adjacent elements $x_i$ and omitting the zero powers, continuing the process as often as necessary.

We consider the word $x_1^0$ as the empty word. Let F(S) denote the set of all reduced words on S. If $S = \emptyset$, F(S) is taken to be the trivial group. If $S \neq \emptyset$, we define a binary operation on F(S) by concatenation of reduced words, i.e., $(x_1^{n_1} \cdots x_t^{n_t})(y_1^{m_1} \cdots y_r^{m_r}) = x_1^{n_1} \cdots x_t^{n_t} y_1^{m_1} \cdots y_r^{m_r} = w$ (say) (making the word reduced).

Show that

(a) F(S) becomes a group under this operation with the empty word as its identity element and $x_t^{-n_t} \cdots x_1^{-n_1}$ as the inverse of the reduced word $x_1^{n_1} \cdots x_t^{n_t}$;

(b) $|X| = |S| \Rightarrow F(X) \cong F(S)$;

(c) The free group F({x}) is the infinite cyclic group, whose only non-empty reduced words are the powers $x^r$;

(d) If S consists of more than one element, then F(5) is not abelian. 

2. Show that the following statements on an abelian group G are equivalent: 

(a) G has a non-empty basis; 

(b) G is the internal direct sum of a family of infinite cyclic subgroups; 

(c) G is isomorphic to a direct sum of copies of $\mathbf{Z}$;

(d) there exist a set S and a map $g : S \to G$ such that, given an abelian group A and a map $h : S \to A$, $\exists$ a unique homomorphism $f : G \to A$ satisfying $f \circ g = h$.

3. Show that given an abelian group A, $\exists$ a free abelian group G and an epimorphism $f : G \to A$.

[Hint. Show that any abelian group is the homomorphic image of a free abelian group.]

4. Let G and H be free abelian groups on the set S w.r.t. maps $g : S \to G$ and $h : S \to H$ respectively. Show that $\exists$ a unique isomorphism $f : G \to H$ such that $f \circ g = h$.

5. If A is a free abelian group of rank n , then any subgroup B of A is a free abelian 
group of rank at most n. 

[Hint. Let $A = \mathbf{Z} \oplus \mathbf{Z} \oplus \cdots \oplus \mathbf{Z}$ (n summands) and $\pi_i : A \to \mathbf{Z}$ be the projection on the ith coordinate.

Given $t \leq n$, let $B_t = \{b \in B : \pi_i(b) = 0,\ \forall i > t\}$. Then $B_t$ is a subgroup of B $\Rightarrow$ $\pi_t(B_t)$ is a subgroup of $\mathbf{Z}$. If $\pi_t(B_t) \neq 0$, take $x_t \in B_t$ such that $\pi_t(x_t)$ is a generator of this group. Otherwise, take $x_t = 0$. Then

(i) $B_t = \langle x_1, x_2, \ldots, x_t \rangle$, for each t;

(ii) the non-zero elements of $\{x_1, x_2, \ldots, x_t\}$ form a basis of $B_t$, for each t;

(iii) $B_n = B$ is a free abelian group with rank B at most n.]

6. Homology and cohomology groups (see Sect. 9.11 of Chap. 9). A sequence $\{C_n\}$ of abelian groups and a sequence $\{\partial_n\}$ of homomorphisms $\partial_n : C_n \to C_{n-1}$ such that $\partial_{n-1} \circ \partial_n = 0$, $\forall n \in \mathbf{Z}$, are called a chain complex $C = (C_n, \partial_n)$ of abelian groups. Given two chain complexes $C = (C_n, \partial_n)$ and $C' = (C'_n, \partial'_n)$, a sequence $f = \{f_n\}$ of homomorphisms $f_n : C_n \to C'_n$ such that $f_{n-1} \circ \partial_n = \partial'_n \circ f_n$, $\forall n \in \mathbf{Z}$, is called a chain map $f : C \to C'$. The elements of $Z_n = \ker \partial_n$ are called n-cycles, elements of $B_n = \mathrm{Im}\, \partial_{n+1}$ are called n-boundaries. $f = \{f_n\}$ is said to be an isomorphism iff each $f_n$ is an isomorphism of groups. Then the quotient group $Z_n/B_n$, denoted $H_n(C)$, exists and is called the nth homology group of C. Moreover each $f_n$ induces a group homomorphism $f_{n*} : H_n(C) \to H_n(C')$ defined by $f_{n*}([z]) = [f_n(z)]$, $\forall [z] \in H_n(C)$ and $f_* = \{f_{n*}\}$ is said to be an isomorphism iff each $f_{n*}$ is an isomorphism.

The nth cohomology group $H^n(C)$ is defined dually.

(a) (Eilenberg-Steenrod). Let the groups in the chain complexes C and C' be free and $f : C \to C'$ be a chain map. Let $K = \{K_n = \ker f_n\}$. Show that $f_*$ is an isomorphism iff $H_n(K) = 0$, $\forall n \in \mathbf{Z}$.

(b) Show that the complex C is exact (see Sect. 9.7 of Chap. 9) iff $H(C) = 0$.

(c) Establish the dual results of (a) and (b) for $H^n(C)$.

7. Using the techniques of algebraic topology, the following results of free groups 
can be proved [see Rotman (1988, p. 305)]: 

(a) Every subgroup G of a free group F is itself free; 

(b) a free group F of rank 2 contains a subgroup that is not finitely generated; 

(c) let F be a free group of finite rank n, and let G be a subgroup of finite index j. Then G is a free group of finite rank; indeed rank $G = jn - j + 1$.


2.9 Topological Groups, Lie Groups and Hopf Groups 

An abstract group endowed with an additional structure whose group operations (multiplication and inversion) are compatible with that structure attracts further interest. Some such groups are studied in this section: topological groups, Lie groups and Hopf groups, which are very important in topology, geometry, and physics. Topological groups were first considered by S. Lie (1842-1899). A topological group is a topological space whose elements form an abstract group such that the group operations are continuous with respect to the topology of the space. A Lie group is a topological group having the structure of a smooth manifold for which the group operations are smooth functions. On the other hand, a Hopf group (H-group) is a pointed topological space with a continuous multiplication such that it satisfies all the axioms of a group up to homotopy. The concept of an H-group is a generalization of the concept of a topological group. The theory of Lie groups is an important branch of group theory. Its importance lies in the fact that Lie groups include almost all important groups of geometry and analysis. The theory of Lie groups stands at the crossing point of the theories of differential manifolds, topological groups, and Lie algebras.


2.9.1 Topological Groups 

Definition 2.9.1 A topological group G is a non-empty set with a group structure and a topology on G such that the function $f : G \times G \to G$, $(x, y) \mapsto xy^{-1}$ is continuous.

The above condition of continuity is equivalent to the statement:

The functions $G \times G \to G$, $(x, y) \mapsto xy$ and $G \to G$, $x \mapsto x^{-1}$ are both continuous.

Some important examples of topological groups. 

Example 2.9.1 (i) $(\mathbf{R}, +)$, under usual addition of real numbers and with the topology induced by the Euclidean metric $d(x, y) = |x - y|$.

(ii) The circle group $(S^1, \cdot)$ in $\mathbf{C}$, topologized by considering it as a subset of $\mathbf{R}^2$.

(iii) $(\mathbf{R}^n, +)$, under usual coordinatewise addition and with product topology.

(iv) $(GL(n, \mathbf{R}), \cdot)$, under usual multiplication of real matrices and with the Euclidean subspace topology of $\mathbf{R}^{n^2}$ $(n \geq 1)$.

(v) The orthogonal group $(O(n), \cdot)$ of real matrices with the Euclidean subspace topology $(n \geq 1)$. It is a subgroup of $GL(n, \mathbf{R})$.

(vi) The general linear group $(GL(n, \mathbf{C}), \cdot)$ over $\mathbf{C}$ topologized by considering it as a subspace of $\mathbf{R}^{2n^2}$.

(vii) The group $(U(n), \cdot)$ of all $n \times n$ complex matrices A such that $A \bar{A}^T = I$. It is a subgroup of $GL(n, \mathbf{C})$.


2.9.2 Lie Groups 

A Lie group is a topological group which is also a manifold with some compatibility conditions. Lie groups have algebraic structures; they are also subsets of well known spaces and have a geometry. Moreover, they are locally Euclidean in the sense that a small portion of them looks like a Euclidean space, making it possible to do analysis on them. Thus Lie groups and other topological groups lie at the crossing of different areas of mathematics. Representations of Lie groups and Lie algebras have revealed many beautiful connections with other branches of mathematics, such as number theory, combinatorics, and algebraic topology, as well as with mathematical physics.

Definition 2.9.2 A topological group G is called a real Lie group iff 

(i) G is a differentiable manifold; 

(ii) the group operations $(x, y) \mapsto xy$ and $x \mapsto x^{-1}$ are both differentiable.

Definition 2.9.3 A topological group G is called a complex Lie group iff

(i) G is a complex manifold;

(ii) the group operations $(x, y) \mapsto xy$ and $x \mapsto x^{-1}$ are both holomorphic.

The name Lie group is in honor of Sophus Lie. The dimension of a Lie group is 
defined as its dimension as a manifold. 

Some Important Examples of Lie Groups 

Example 2.9.2 (i) $(\mathbf{R}^n, +)$ is an n-dimensional Lie group over $\mathbf{R}$.

(ii) $(\mathbf{C}^n, +)$ is an n-dimensional Lie group over $\mathbf{C}$, but it is a 2n-dimensional Lie group over $\mathbf{R}$.

(iii) $GL(n, \mathbf{R})$ is a real Lie group of dimension $n^2$.

(iv) $GL(n, \mathbf{C})$ is a complex Lie group of dimension $n^2$.

(v) $SL(n, \mathbf{R})$ is a real Lie group of dimension $n^2 - 1$.

(vi) $SL(n, \mathbf{C})$ is a complex Lie group of dimension $n^2 - 1$.


2.9.3 Hopf’s Groups or H-Groups

A pointed topological space is a non-empty topological space with a distinguished 
element. 

Let P and Q be pointed topological spaces and $f, g : P \to Q$ be base point preserving continuous maps. Then f and g are said to be homotopic relative to the base point $p_0 \in P$, denoted by $f \simeq g$ rel $p_0$, iff there exists a continuous one-parameter family of maps $f_t : P \to Q$ such that $f_0 = f$, $f_1 = g$ and $f_t(p_0) = f(p_0) = g(p_0)$ for all $t \in I = [0, 1]$. $f_t$ is called a homotopy between f and g relative to $p_0$, denoted by $f_t : f \simeq g$ rel $p_0$.

Since $\simeq$ is an equivalence relation, one can speak of homotopy classes of maps between two spaces. Thus [P; Q] usually denotes the set of all homotopy classes of base point preserving continuous maps $P \to Q$ (all homotopies are relative to the base point).

Definition 2.9.4 A pointed topological space P is called a Hopf group or H-group iff $\exists$ a continuous multiplication $\mu : P \times P \to P$ such that

(i) $\mu(c, 1_P) \simeq 1_P \simeq \mu(1_P, c)$;

(ii) $\mu(\mu \times 1_P) \simeq \mu(1_P \times \mu)$;

(iii) $\mu(1_P, \phi) \simeq c \simeq \mu(\phi, 1_P)$,

where $c : P \to p_0 \in P$ is the constant map ($p_0$ is the base point of P), $1_P : P \to P$ is the identity map and $\phi : P \to P$ is a continuous map; $\phi$ is called a homotopy inverse for P and $\mu$. Then the homotopy class [c] of c is a homotopy identity and $[\phi]$ is a homotopy inverse for P and $\mu$.

Clearly, any topological group is an H-group with identity element e as a base point.

Theorem 2.9.1 Let X be an arbitrary pointed topological space and P be an H- 
group. Then [ X ; P] is a group. 

Proof Define for every pair of maps $g_1, g_2 : X \to P$, the product $g_1 g_2 : X \to P$ by $(g_1 g_2)(x) = \mu(g_1(x), g_2(x)) = g_1(x) g_2(x)$, where the right hand multiplication is the multiplication $\mu$ in the H-group P. This law of composition carries over to give an operation on homotopy classes such that $[g_1][g_2] = [g_1 g_2]$. As two homotopic maps determine the same homotopy class, the group axioms for [X; P] follow from (i)-(iii) of Definition 2.9.4. □

Theorem 2.9.2 If $f : X \to Y$ is a base point preserving continuous map, then f induces a group homomorphism

$f^* : [Y; P] \to [X; P]$ for each H-group P.

Proof Define $f^*$ by $f^*[h] = [h \circ f]$. This map is well defined, since $h_0 \simeq h_1 \Rightarrow h_0 \circ f \simeq h_1 \circ f$ (cf. Spanier 1966, Th. 6, p. 24). Verify that $f^*$ is a group homomorphism. □

Let P and P' be H-groups with multiplications $\mu$ and $\mu'$, respectively. Then a continuous map $\alpha : P \to P'$ is called a homomorphism of H-groups, iff $\alpha\mu \simeq \mu'(\alpha \times \alpha)$.

Theorem 2.9.3 If $\alpha : P \to P'$ is a homomorphism between H-groups, then $\alpha$ induces a group homomorphism $\alpha_* : [X; P] \to [X; P']$.

Proof Define $\alpha_* : [X; P] \to [X; P']$ by $\alpha_*[f] = [\alpha \circ f]$. Then $\alpha_*$ is well defined and a homomorphism. □

Remark For further results, see Appendix B. 


2.10 Fundamental Groups 

Homotopy theory plays an important role in mathematics. In this section, we define the fundamental group of a topological space X based at a point $x_0$ in X. This group is an algebraic invariant in the sense that the fundamental groups of two homeomorphic spaces are isomorphic. By using fundamental groups, many problems of algebra, topology, and geometry are solved by reducing them to algebraic problems. For example, the fundamental theorem of algebra is proved in Chap. 11 by homotopy theory and the Brouwer Fixed Point Theorem for dimension 2 is proved in Appendix B by the fundamental group.


Definition 2.10.1 Let X be a topological space and $x_0$ be a fixed point of X, called a base point of X. A continuous map $f : I \to X$ (where I = [0, 1]) is called a path in X; f(0) and f(1) are called the initial point and the terminal point of the path f, respectively. If $f(0) = f(1) = x_0$, the path f is called a loop in X based at $x_0$. A space X is said to be path connected iff any two points of X can be joined by a path. Let $f, g : I \to X$ be two loops based at $x_0$. Then they are called homotopic relative to 0 and 1, denoted by $f \simeq g$ rel {0, 1}, iff $\exists$ a continuous map $F : I \times I \to X$ such that

$F(t, 0) = f(t)$, $F(t, 1) = g(t)$, $F(0, s) = F(1, s) = x_0$, $\forall t, s \in I$.

The above homotopy relation between loops is an equivalence relation, and this gives the set of homotopy classes relative to 0 and 1 of the loops based at $x_0$, which is denoted by $\pi_1(X, x_0)$. Given loops $f, g : I \to X$ based at $x_0$, their product $f * g$ is a loop $f * g : I \to X$ at $x_0$ defined by

$(f * g)(t) = \begin{cases} f(2t), & 0 \leq t \leq \frac{1}{2}, \\ g(2t - 1), & \frac{1}{2} \leq t \leq 1. \end{cases}$

If [f] and [g] are two elements of $\pi_1(X, x_0)$, define their product $[f] \circ [g] = [f * g]$. This product is well defined and $\pi_1(X, x_0)$ is a group under this composition. This group is called the fundamental group of X based at $x_0$.

Clearly, if X is path connected, $\pi_1(X, x_0)$ is independent (up to isomorphism) of its base point $x_0$.

We now find geometrically the fundamental group of the unit circle $S^1$, which is the boundary of the unit ball or disc $E^2$ in $\mathbf{R}^2$. Since $S^1$ is path connected, its fundamental group $\pi_1(S^1, x_0)$ is independent of its base point $x_0$. A loop f in $S^1$ based at $x_0$ is a closed path starting at $x_0$ and ending at $x_0$. Then f is either a null path (constant path) or f is given by one or more complete circuits (clockwise or anti-clockwise) around the circle. Let f and g be two loops in $S^1$ based at the same point $x_0$ such that f describes $S^1$ m times and g describes $S^1$ n times. If $m > n$, then $f * g^{-1}$ is a path describing the circle $S^1$ $(m - n)$ times, and it is not homotopic to a null path. Thus the homotopy classes of loops in $S^1$ based at $x_0$ are in bijective correspondence with $\mathbf{Z}$. Hence $\pi_1(S^1, x_0)$ is isomorphic to $\mathbf{Z}$.

Proceeding as above, for a solid disc $E^2$ the fundamental group $\pi_1(E^2, x_0) = 0$, since the space $E^2$ is a convex subspace of $\mathbf{R}^2$.

For an analytical proof use the degree function d defined in (v) below or [see Rotman (1988)].

Let $\dot{I} = \{0, 1\}$ be the set of end points of the closed interval I = [0, 1]. Then a loop in $S^1$ based at 1 is a continuous map $f : (I, \dot{I}) \to (S^1, 1)$. Let $p : (\mathbf{R}, r_0) \to (S^1, 1)$ be the exponential map defined by $p(t) = e^{2\pi i t}$, $\forall t \in \mathbf{R}$, where $r_0 \in \mathbf{Z}$. We now present some interesting properties of fundamental groups (see Rotman (1988)).

(i) There exists a unique continuous map $\tilde{f} : I \to \mathbf{R}$ such that $p \circ \tilde{f} = f$ and $\tilde{f}(0) = 0$.

(ii) If $g : (I, \dot{I}) \to (S^1, 1)$ is continuous and $f \simeq g$ relative to $\dot{I}$, then $\tilde{f} \simeq \tilde{g}$ relative to $\dot{I}$ and $\tilde{f}(1) = \tilde{g}(1)$, where $p \circ \tilde{g} = g$ and $\tilde{g}(0) = 0$.

(iii) If $f : (I, \dot{I}) \to (S^1, 1)$ is continuous, the degree of f, denoted by deg f, is defined by deg $f = \tilde{f}(1)$, where $\tilde{f}$ is the unique lifting of f with $\tilde{f}(0) = 0$. Then deg f is an integer.

(iv) If $f(x) = x^n$, then deg f = n and the degree of a constant loop is 0.

(v) The function $d : \pi_1(S^1, 1) \to \mathbf{Z}$, defined by $d([f]) = $ deg f, is an isomorphism of groups.

(vi) Two loops in $S^1$ at the base point 1 are homotopic relative to $\dot{I}$ if and only if they have the same degree.
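
The degree in (iii)-(v) can be approximated numerically. The following Python sketch is only an illustration under stated assumptions: a loop is given as a function from [0, 1] to the unit circle in $\mathbf{C}$, the lifting is followed by adding up small changes of argument over a fine subdivision, and the names `degree`, `power_loop` and `concat` are chosen here and do not come from the text.

```python
import cmath
import math


def degree(loop, steps=2000):
    """Approximate deg f by summing small increments of the argument along f.

    Assumes consecutive sample points differ by an angle smaller than pi,
    so each increment equals the corresponding increment of the lifting.
    """
    total = 0.0
    prev = loop(0.0)
    for k in range(1, steps + 1):
        cur = loop(k / steps)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


def power_loop(n):
    """The loop t -> exp(2*pi*i*n*t), i.e. z -> z^n read along the circle."""
    return lambda t: cmath.exp(2j * math.pi * n * t)


def concat(f, g):
    """The product loop f * g: first f at double speed, then g."""
    return lambda t: f(2 * t) if t <= 0.5 else g(2 * t - 1)


print(degree(power_loop(3)))
print(degree(concat(power_loop(2), power_loop(1))))
```

The second call illustrates that the degree of a product loop is the sum of the degrees, in line with d being a homomorphism in (v).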


2.10.1 A Generalization of Fundamental Groups 


Rodes (1966) introduced the fundamental group $\sigma(X, x_0, G)$ of a transformation group $(X, G, \sigma)$ (see Definition 3.1.1), where X is a path connected space with $x_0$ as base point. Given an element $g \in G$, a path $\alpha$ of order g with base point $x_0$ is a continuous map $\alpha : I \to X$ such that $\alpha(0) = x_0$, $\alpha(1) = g x_0$. A path $\alpha_1$ of order $g_1$ and a path $\alpha_2$ of order $g_2$ give rise to a path $\alpha_1 + g_1 \alpha_2$ of order $g_1 g_2$ defined by the equations

$(\alpha_1 + g_1 \alpha_2)(s) = \begin{cases} \alpha_1(2s), & 0 \leq s \leq \frac{1}{2}, \\ g_1 \alpha_2(2s - 1), & \frac{1}{2} \leq s \leq 1. \end{cases}$

Two paths $\alpha$ and $\beta$ of the same order g are said to be homotopic iff $\exists$ a continuous map $F : I \times I \to X$ such that

$F(s, 0) = \alpha(s)$, $0 \leq s \leq 1$;

$F(s, 1) = \beta(s)$, $0 \leq s \leq 1$;

$F(0, t) = x_0$, $0 \leq t \leq 1$;

$F(1, t) = g x_0$, $0 \leq t \leq 1$.

The homotopy class of a path $\alpha$ of order g is denoted by $[\alpha; g]$. Prove that the set of homotopy classes of paths of prescribed order with rule of composition ‘$\circ$’ forms a group, where $\circ$ is defined by

$[\alpha; g_1] \circ [\beta; g_2] = [\alpha + g_1 \beta;\ g_1 g_2]$.

This group, denoted by $\sigma(X, x_0, G)$, is called the fundamental group of $(X, G, \sigma)$ with base point $x_0$.

Problems and Supplementary Examples (SE-I) 

1 Show that (Q, +) is not finitely generated (+ denotes usual addition of rational 
numbers). 

Solution If possible, suppose $(\mathbf{Q}, +)$ is finitely generated. Then there exists a finite set

$$S = \left\{ \frac{p_1}{q_1}, \frac{p_2}{q_2}, \ldots, \frac{p_n}{q_n} \right\}$$

of rational numbers such that $\mathbf{Q} = \langle S \rangle$. Now we can find a prime $p$ such that $p$ does not divide $q_1, q_2, \ldots, q_n$. Let $x \in \mathbf{Q}$. Then there exist integers $m_1, m_2, \ldots, m_n$ such that

$$x = m_1\frac{p_1}{q_1} + m_2\frac{p_2}{q_2} + \cdots + m_n\frac{p_n}{q_n} = \frac{m}{q_1 q_2 \cdots q_n}$$

for some integer $m$. Since $p$ does not divide $q_1, q_2, \ldots, q_n$, $p$ does not divide $q_1 q_2 \cdots q_n$. Hence $p$ does not divide the denominator of any rational number (expressed in lowest terms) of $\langle S \rangle$. This shows that $\frac{1}{p} \notin \langle S \rangle = \mathbf{Q}$. This yields a contradiction. Hence $(\mathbf{Q}, +)$ is not finitely generated.

2 Let $S_n$ be the symmetric group on $\{x_1, x_2, \ldots, x_n\}$ with identity $e$. For $i = 1, 2, \ldots, n-1$, write $\sigma_i$ for the permutation of $S_n$ which transposes $x_i$ and $x_{i+1}$ and keeps the remaining elements fixed. Then show that

(i) $\sigma_i^2 = e$ for $i = 1, 2, \ldots, n-1$;

(ii) $\sigma_i\sigma_j = \sigma_j\sigma_i$ if $|i - j| \ge 2$;

(iii) $(\sigma_i\sigma_{i+1})^3 = e$ for $1 \le i \le n-2$;

(iv) $\sigma_1, \sigma_2, \ldots, \sigma_{n-1}$ generate the group $S_n$.

(v) The relations (i)-(iii) between the generators determine the Cayley table of $S_n$ completely (these relations are called the defining relations of $S_n$).
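Parts (i)-(iv) can be checked by brute force for a small $n$. The following is only a sanity check, not a proof; permutations are stored as tuples acting on $0, \ldots, n-1$, and all function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(p)))

def adjacent_transposition(n, i):
    """sigma_i swaps positions i and i+1 (0-based) and fixes everything else."""
    s = list(range(n))
    s[i], s[i + 1] = s[i + 1], s[i]
    return tuple(s)

def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def generated_subgroup(gens, n):
    """Close the generators under composition (enough in a finite group)."""
    identity = tuple(range(n))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

def check_relations(n):
    e = tuple(range(n))
    sig = [adjacent_transposition(n, i) for i in range(n - 1)]
    involutions = all(power(s, 2) == e for s in sig)
    commute = all(compose(sig[i], sig[j]) == compose(sig[j], sig[i])
                  for i in range(n - 1) for j in range(n - 1) if abs(i - j) >= 2)
    braid = all(power(compose(sig[i], sig[i + 1]), 3) == e for i in range(n - 2))
    generates = generated_subgroup(sig, n) == set(permutations(range(n)))
    return involutions, commute, braid, generates

print(check_relations(4))
```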
3 Consider the following conjecture: “Let $H$ be a subgroup of a finite group $G$. If $H$ is abelian, then $H$ is normal in $G$.” Disprove the conjecture by reference to the Octic group and one of its subgroups.

4 Show by an example that the converse of Lagrange’s Theorem is not true for 
arbitrary finite groups. 

Solution We show that the alternating group $A_4$ of order 12 has no subgroup $H$ of order 6. If possible, let $|H| = 6$. All the following eight 3-cycles (1 2 3), (1 3 2), (1 2 4), (1 4 2), (1 3 4), (1 4 3), (2 3 4), (2 4 3) are in $A_4$. As $|H| = 6$, $H$ cannot contain all these elements. Let $\rho = (x\ y\ z)$ be a 3-cycle in $A_4$ such that $\rho \notin H$. Let $K = \{\rho, \rho^2, \rho^3 = 1\}$. Then $K$ is a subgroup of $A_4$ of prime order 3 which is not contained in $H$, so $H \cap K = \{1\}$. Hence $|HK| = \frac{|H|\,|K|}{|H \cap K|} = 18$. But $HK \subseteq A_4$ and $|A_4| = 12$, which is a contradiction.
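The conclusion can also be confirmed by exhaustive search, since $A_4$ has only $2^{11}$ subsets containing the identity. This is a brute-force sketch with illustrative names; it relies on the fact that a non-empty finite subset closed under the product is a subgroup.

```python
from itertools import combinations, permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def is_even(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]
identity = tuple(range(4))

def is_subgroup(subset):
    """A non-empty finite subset closed under the product is a subgroup."""
    s = set(subset)
    return all(compose(a, b) in s for a in s for b in s)

def subgroup_orders(group):
    """Orders of all subgroups, found by testing every subset containing the identity."""
    orders = set()
    others = [g for g in group if g != identity]
    for k in range(len(group)):
        for rest in combinations(others, k):
            if is_subgroup((identity,) + rest):
                orders.add(k + 1)
    return sorted(orders)

print(len(A4), subgroup_orders(A4))
```

Although 6 divides 12, the value 6 does not occur among the subgroup orders, which is exactly the failure of the converse of Lagrange's Theorem.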

Problems and Supplementary Examples (SE-II) 

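1 Let $G$ be a group and $H$ be a subgroup of $G$. Show that $\forall g \in G$, $gH = H \Leftrightarrow g \in H$.

Solution Let $g \in G$ and $gH = H$. Then $g = g \cdot 1 \in gH = H \Rightarrow g \in H$. Thus $\forall g \in G$, $gH = H \Rightarrow g \in H$. Conversely, suppose $g \in H$. Then for $\forall h \in H$, $gh \in H \Rightarrow gH \subseteq H$. Again $g, h \in H \Rightarrow g^{-1}h \in H \Rightarrow h = g(g^{-1}h) \in gH \Rightarrow H \subseteq gH$. Thus for any $g \in H$, $gH = H$.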

2 Show that any factor group of a cyclic group is cyclic. 

Solution Let $G = \langle a \rangle$ be a cyclic group and $N$ be a normal subgroup of $G$. Then computation of all powers of $aN$ amounts to computing, in $G$, all powers of the representative $a$, and these powers give all the elements of $G$. Hence the powers of $aN$ give all the cosets of $N$ and thus $G/N = \langle aN \rangle \Rightarrow G/N$ is cyclic.

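3 Find the kernel of the homomorphism $f : (\mathbf{R}, +) \to (\mathbf{C}^*, \cdot)$ defined by $f(x) = e^{ix}$ and hence find the factor group $\mathbf{R}/\ker f$.

Solution

$$\ker f = \{x \in \mathbf{R} : e^{ix} = 1\} = \{x \in \mathbf{R} : \cos x + i \sin x = 1\} = \{x \in \mathbf{R} : x = 2\pi n,\ n \in \mathbf{Z}\} = \langle 2\pi \rangle.$$

Thus $\mathbf{R}/\ker f = \mathbf{R}/\langle 2\pi \rangle \cong \operatorname{Im} f = S^1$ (circle group in $\mathbf{C}^*$, see Example 2.3.2).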

4 Show that a group having only a finite number of subgroups is finite. 

Solution If $G = \{1\}$, then $G$ is finite. Suppose $G \ne \{1\}$. Then $\exists$ an element $a\ (\ne 1) \in G$. Thus $H = \langle a \rangle$ is a finite subgroup of $G$. Otherwise, $(H, \cdot) \cong (\mathbf{Z}, +)$ and $(\mathbf{Z}, +)$ has an infinite number of subgroups, viz. $\langle n \rangle$ (by Theorem 2.4.10), $n = 0, 1, 2, \ldots$, which would imply that $H$ has an infinite number of subgroups, contradicting our hypothesis. Thus $O(a)$ is also finite. If $G = \langle a \rangle$, the proof ends. If $G \ne \langle a \rangle$, then $\exists$ only a finite number of elements in $G - \langle a \rangle$. Otherwise, for each $g \in G - \langle a \rangle$, $\langle g \rangle$ is a finite subgroup of $G$, and infinitely many elements cannot lie in finitely many finite subgroups $\Rightarrow$ $\exists$ infinitely many non-trivial subgroups of $G$ $\Rightarrow$ a contradiction of hypothesis. Consequently, $G$ is finite.

5 Let $G$ be a finite group and $H$ be a proper subgroup of $G$ such that for $x, y \in G - H$, $xy \in H$. Prove that $H$ is a normal subgroup of $G$ such that $|G|$ is an even integer.

Solution $g \in H \Rightarrow ghg^{-1} \in H$, $\forall h \in H$. Again $g \notin H \Rightarrow hg^{-1} \notin H$, $\forall h \in H$. This is so because $hg^{-1} \in H$ and $h, h^{-1} \in H \Rightarrow h^{-1}(hg^{-1}) \in H \Rightarrow g^{-1} \in H \Rightarrow g \in H$, a contradiction. Thus $g, hg^{-1} \in G - H \Rightarrow g(hg^{-1}) \in H$ by hypothesis $\Rightarrow ghg^{-1} \in H$. Hence it follows that $H$ is a normal subgroup of $G$. Thus $G/H$ is a group. We now show that $|G|$ is even. By hypothesis, $\exists x \in G$ such that $x \notin H$ but $x^2 \in H$. Hence $x^2H = H \Rightarrow (xH)^2 = eH$. Again $x \notin H \Rightarrow xH \ne eH$. Thus $O(xH) = |\langle xH \rangle| = 2$. Now $|\langle xH \rangle|$ divides $|G/H| \Rightarrow 2$ divides $|G/H|$. Hence $|G/H|$ is an even integer $\Rightarrow |G| = |G/H| \cdot |H|$ is an even integer.

6 Let $G$ be a finite group and $H, K$ be subgroups of $G$ such that $K \subseteq H$. Then $[G : K] = [G : H][H : K]$.

Solution $H = \bigcup_i x_iK$ and $G = \bigcup_j y_jH$, where both are disjoint unions $\Rightarrow G = \bigcup_{i,j} y_jx_iK$ is also a disjoint union $\Rightarrow [G : K] = [G : H][H : K]$.

7 Let G be a finite cyclic group of order n . Show that corresponding to each posi- 
tive divisor d of n, 3 a unique subgroup of G of order d. 

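Solution Let $G = \langle g \rangle$ for some $g \in G$ and $d$ a positive divisor of $n$. Then $n = md$ for some $m \in \mathbf{Z}$.

Now $g^m \in G \Rightarrow O(g^m) = \frac{n}{\gcd(m, n)} = \frac{n}{m} = d$ by Theorem 2.3.5.

Let $H = \langle g^m \rangle$. Then $H$ is a subgroup of $G$ of order $d$.

Uniqueness of $H$: Let $K$ be a subgroup of $G$ of order $d$. Let $t$ be the smallest positive integer such that $g^t \in K$. Then $K = \langle g^t \rangle$. Now $|K| = d \Rightarrow O(g^t) = d \Rightarrow d = \frac{n}{\gcd(t, n)}$ by Theorem 2.3.5 $\Rightarrow \gcd(t, n) = \frac{n}{d} = m \Rightarrow m \mid t$. If $t = ml$ for some $l \in \mathbf{Z}$, then $g^t = g^{ml} = (g^m)^l \in H \Rightarrow K \subseteq H \Rightarrow K = H$, since $|H| = |K|$.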
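The statement can be illustrated in the additive model $\mathbf{Z}_n$ of a cyclic group of order $n$, where the generator $g^m$ of the solution corresponds to the residue $m$. A small sketch with illustrative names follows; it uses the fact that every subgroup of a cyclic group is cyclic, so listing the subgroups $\langle g \rangle$ lists all subgroups.

```python
from math import gcd

def cyclic_subgroup(n, g):
    """The subgroup of (Z_n, +) generated by g."""
    return frozenset((g * k) % n for k in range(n))

def subgroups_by_order(n):
    """Map each subgroup order to the set of distinct subgroups of Z_n of that order."""
    table = {}
    for g in range(n):
        h = cyclic_subgroup(n, g)
        table.setdefault(len(h), set()).add(h)
    return table

n = 12
table = subgroups_by_order(n)
for d in sorted(table):
    m = n // d
    # In additive notation the generator g^m of the solution is the residue m.
    assert table[d] == {cyclic_subgroup(n, m)}
    assert n // gcd(m, n) == d
print({d: len(hs) for d, hs in sorted(table.items())})
```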

8 Let $G$ be a finite group and $f : G \to \mathbf{Z}_{15}$ be an epimorphism. Show that $G$ has normal subgroups with indices 3 and 5.

Solution $G/\ker f \cong \mathbf{Z}_{15}$ by the First Isomorphism Theorem. Then $\mathbf{Z}_{15}$ being a cyclic group of order 15, $G/\ker f$ is a cyclic group of order 15 $\Rightarrow G/\ker f$ has a normal subgroup $N_1$ of order 5 and a normal subgroup $N_2$ of order 3 $\Rightarrow \exists$ normal subgroups $H_1$ and $H_2$ of $G$ such that $\ker f \subseteq H_1$, $\ker f \subseteq H_2$, $H_1/\ker f = N_1$ and $H_2/\ker f = N_2$ by the Correspondence Theorem. Hence $15 = |G/\ker f| = [G : \ker f] = [G : H_1][H_1 : \ker f]$ (by Ex. 6 of SE-II) $= [G : H_1] \cdot 5 \Rightarrow [G : H_1] = 3$. Similarly, $[G : H_2] = 5$.

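9 Prove that the group $\mathbf{Z}_m \times \mathbf{Z}_n$ is isomorphic to the group $\mathbf{Z}_{mn} \Leftrightarrow \gcd(m, n) = 1$, i.e., $\mathbf{Z}_m \times \mathbf{Z}_n \cong \mathbf{Z}_{mn} \Leftrightarrow m, n$ are relatively prime.

Solution $\mathbf{Z}_m$ and $\mathbf{Z}_n$ are cyclic groups of order $m$ and $n$, respectively. If $\gcd(m, n) = 1$, then $\mathbf{Z}_m \times \mathbf{Z}_n$ is a cyclic group of order $mn$ and hence isomorphic to $\mathbf{Z}_{mn}$ (see Theorems 2.4.12 and 2.4.13). Conversely, let $\mathbf{Z}_{mn} \cong \mathbf{Z}_m \times \mathbf{Z}_n$. If $\gcd(m, n) = d > 1$, then $mn = d \cdot \operatorname{lcm}(m, n) \Rightarrow \frac{mn}{d} = \operatorname{lcm}(m, n) \Rightarrow \frac{mn}{d} = t$ is divisible by both $m$ and $n$. Now for $(x, y) \in \mathbf{Z}_m \times \mathbf{Z}_n$, $(x, y) + (x, y) + \cdots + (x, y)$ ($t$ summands) $= (0, 0)$, where $t < mn$ $\Rightarrow$ no element $(x, y) \in \mathbf{Z}_m \times \mathbf{Z}_n$ can generate the entire group $\mathbf{Z}_m \times \mathbf{Z}_n$. Hence $\mathbf{Z}_m \times \mathbf{Z}_n$ cannot be cyclic and therefore cannot be isomorphic to $\mathbf{Z}_{mn}$. So we reach a contradiction. Hence $\gcd(m, n) = 1$.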
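Both directions of the argument can be observed on small cases: the largest element order in $\mathbf{Z}_m \times \mathbf{Z}_n$ is $\operatorname{lcm}(m, n)$, which equals $mn$ exactly when $\gcd(m, n) = 1$. The following brute-force sketch uses illustrative function names.

```python
from math import gcd

def element_order(x, y, m, n):
    """Order of (x, y) in Z_m x Z_n under componentwise addition."""
    k, a, b = 1, x % m, y % n
    while (a, b) != (0, 0):
        a, b = (a + x) % m, (b + y) % n
        k += 1
    return k

def is_cyclic(m, n):
    """The product is cyclic iff some element has order mn."""
    return any(element_order(x, y, m, n) == m * n
               for x in range(m) for y in range(n))

def max_order(m, n):
    return max(element_order(x, y, m, n) for x in range(m) for y in range(n))

for m, n in [(2, 3), (4, 6), (3, 3), (4, 9)]:
    print((m, n), is_cyclic(m, n), max_order(m, n), m * n // gcd(m, n))
```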

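10 If $n = p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r}$, where the $p_i$'s are distinct primes and $n_i > 0$, show that $\mathbf{Z}_n$ is isomorphic to $\mathbf{Z}_{p_1^{n_1}} \times \mathbf{Z}_{p_2^{n_2}} \times \cdots \times \mathbf{Z}_{p_r^{n_r}}$.

Solution The result follows from Ex. 9 of SE-II by an induction argument.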

11 Prove that a finitely generated group cannot be expressed as the union of an 
ascending sequence of its proper subgroups. 

Solution Let $G$ be a finitely generated group $G = \langle a_1, a_2, \ldots, a_n \rangle$. If possible, let $G = \bigcup_i A_i$, where $A_1 \subseteq A_2 \subseteq \cdots$ and each $A_i$ is a proper subgroup of $G$, $i = 1, 2, \ldots$. Then $\exists$ a positive integer $t$ such that $a_1, a_2, \ldots, a_n \in A_t$. Then $G = \langle a_1, a_2, \ldots, a_n \rangle \subseteq A_t$. Again $A_t \subseteq G$. Hence $G = A_t \Rightarrow$ a contradiction, since $A_t$ is a proper subgroup of $G$.

12 Show that every finitely generated subgroup of (Q, +) is cyclic. 

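Solution Let $H$ be a finitely generated subgroup of $(\mathbf{Q}, +)$. Then any element $x$ of $H$ is of the form

$$x = \frac{m}{q_1 q_2 \cdots q_n} \quad \text{for some integer } m \text{ (proceed as Ex. 1 of SE-I)}$$

$\Rightarrow H \subseteq K$, where $K$ is a cyclic group generated by $\frac{1}{q_1 q_2 \cdots q_n}$ $\Rightarrow H$ is a cyclic group, since every subgroup of a cyclic group is cyclic.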
13 Prove that every finite subgroup of $(\mathbf{C}^*, \cdot)$ is cyclic.

Solution Let $(H, \cdot)$ be a finite subgroup of $(\mathbf{C}^*, \cdot)$. Let $|H| = n$ and $x \in H$. Then $x^n = 1$.

The roots of $x^n = 1$ are given by

$$1, w, w^2, \ldots, w^{n-1}, \quad \text{where } w^k = \cos\frac{2k\pi}{n} + i \sin\frac{2k\pi}{n},\ k = 0, 1, 2, \ldots, n-1.$$

Since $H$ has $n$ elements and each of them is one of these $n$ roots, each of the roots belongs to $H$ and $O(w) = n = |H|$. Hence $H = \langle w \rangle \Rightarrow H$ is cyclic.

14 Prove that the alternating group $A_n$ of degree $n$ ($\ge 3$) is a normal subgroup of $S_n$ (see Ex. 58 of Exercises-III).

Solution Consider the subgroup $G = (\{1, -1\}, \cdot)$ of $(\mathbf{R}^*, \cdot)$. Define a mapping $\psi : S_n \to G$ by

$$\psi(\sigma) = \begin{cases} 1 & \text{if } \sigma \text{ is an even permutation}, \\ -1 & \text{if } \sigma \text{ is an odd permutation}. \end{cases}$$

Then $\psi$ is an epimorphism such that $\ker \psi = \{\sigma \in S_n : \psi(\sigma) = 1\} = A_n$.

Hence $A_n$ is a normal subgroup of $S_n$ by Theorem 2.6.5.

15 Let H be a normal subgroup of a group G such that H and G/H are both 
cyclic. Show that G is not in general cyclic. 

Solution Take $G = S_3$ and $H = \{e, (1\ 2\ 3), (1\ 3\ 2)\}$.

Then $|G/H| = 2 \Rightarrow H$ is a normal subgroup of $G$. Hence $G/H$ is also a group. Since $|G/H| = 2$ and $|H| = 3$, which are both prime, $G/H$ and $H$ are both cyclic groups. But $G$ is not commutative $\Rightarrow S_3$ is not cyclic.

16 Let $G$ be a non-cyclic group of order $p^2$ ($p$ is a prime integer) and $g\ (\ne e) \in G$. Show that $O(g) = p$.

Solution $O(g) \mid |G| = p^2 \Rightarrow O(g) = 1$, $p$ or $p^2$. Now $g \ne e \Rightarrow O(g) \ne 1$. If $O(g) = p^2$, then $G$ contains an element $g$ such that $O(g) = |G|$. Hence $G$ is cyclic $\Rightarrow$ a contradiction of hypothesis $\Rightarrow O(g) \ne p^2$. Thus $O(g) \ne 1$, $O(g) \ne p^2 \Rightarrow O(g) = p$.


2.11 Exercises 

Exercises-III 

1. Prove that a semigroup $S$ is a group iff for each $a \in S$, $aS = Sa = S$.
2. Let $S$ be a semigroup. A non-empty subset $A$ of $S$ is called a pseudo-ideal (left, right) iff $ab \in A$ $\forall a, b \in A$; $x^2A \subseteq A$ and $Ax^2 \subseteq A$ ($x^2A \subseteq A$, $Ax^2 \subseteq A$) for every $x \in S$. Show that a semigroup $S$ is a group iff $x(A \setminus B)x \subseteq A \setminus B$ for every $x \in S$ when $A \setminus B \ne \emptyset$, where $A, B$ are either both left pseudo-ideals or both right pseudo-ideals of $S$.

3. Show that (a) in the semigroup $(\mathbf{Z}^*, \cdot)$ (of all non-zero integers under usual multiplication), $A = \{a \in \mathbf{Z}^* : a > 6\}$ is a pseudo-ideal but not an ideal of $\mathbf{Z}^*$.

(b) In the multiplicative group $(\mathbf{Q}^*, \cdot)$ of all non-zero rational numbers, $A = \{r^2 : r \in \mathbf{Q}^*\}$ is a proper pseudo-ideal of $\mathbf{Q}^*$.

(This example shows that a group may contain a proper pseudo-ideal.) 

4. Define a binary operation o on Z by aob = a + b — 2. Show that (Z, o) is a 
commutative group. 

5. Let G be the set of all rational numbers excepting — 1 . Define a binary operation 
‘o’ on G by a o b = a + b + ab. Show that (G, o) is a group. 

6. Define a binary operation ‘o’ on the set of non-zero rational numbers by a o b = 
\a\b. Is (R*, o) a group? Justify your answer. 

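7. Let

$$G = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbf{R} \text{ (or } \mathbf{Z}\text{) and } ad - bc = 1 \right\}.$$

Show that $G$ is a group under usual multiplication. Is this group commutative?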

8. (i) A semigroup $(X, \circ)$ is said to be cancellative iff for each $x, y, z \in X$, $x \circ y = x \circ z \Rightarrow y = z$ and $y \circ x = z \circ x \Rightarrow y = z$. Show that a finite cancellative semigroup $(X, \circ)$ is a group.

Does this result hold if the semigroup is not finite?

(ii) Show that a cancellative semigroup can be extended to a group if it is commutative.

9. Let $SL(n, \mathbf{R})$ ($SL(n, \mathbf{C})$) be the set of all $n \times n$ ($n > 1$) unimodular matrices $A$ (i.e., $\det A = 1$) with real (complex) entries. Show that $(SL(n, \mathbf{R}), \cdot)$ ($(SL(n, \mathbf{C}), \cdot)$) is a group under usual multiplication.

10. Let $U_n$ ($n > 1$) denote the set of all $n \times n$ complex matrices $A$ such that $A\bar{A}^T = I$. Show that $(U_n, \cdot)$ is a non-commutative group and is a subgroup of $GL(n, \mathbf{C})$ (under usual multiplication). ($(U_n, \cdot)$ is called the unitary group.)

[Hint. $A\bar{A}^T = I \Rightarrow |\det A|^2 = 1$.] Show that, in particular, the set of all unitary matrices $A$ with $\det A = 1$ is a non-commutative group (known as the special unitary group).

11. Let $O_n$ ($n > 1$) denote the set of all orthogonal $n \times n$ real matrices $A$ (i.e., $AA^T = I$). Show that $(O_n, \cdot)$ is a non-commutative group (under usual multiplication).

12. Let $SO_n$ be the special orthogonal group consisting of all orthogonal matrices $A$ such that $\det A = 1$. Show that $(SO_n, \cdot)$ is a group, which is non-commutative for $n \ge 3$, but $(SO_n, +)$ is not a group (where '$\cdot$' and '+' denote usual multiplication and addition of matrices respectively).

[Hint. $(SO_n, +)$ does not contain identity.]
13. Let $(G, \cdot)$ be a group and $S, T$ subsets of $G$. Then define the product $S \circ T$ as

$$S \circ T = \begin{cases} \{z \in G : z = xy \text{ for some } x \in S,\ y \in T\}, \\ \emptyset \quad \text{if either } S \text{ or } T \text{ is empty}. \end{cases}$$

Then the inverse of the set $S$, denoted by $S^{-1}$, is defined by

$$S^{-1} = \begin{cases} \{z \in G : z = x^{-1} \text{ for some } x \in S\}, \\ \emptyset \quad \text{iff } S = \emptyset. \end{cases}$$

Show that if $S, T, W$ are subsets of $G$, then

(a) $(S \circ T) \circ W = S \circ (T \circ W)$;

(b) $(S \circ T)^{-1} = T^{-1} \circ S^{-1}$;

(c) $S \circ T \ne T \circ S$ (in general).

14. Show that a non-empty subset $T$ of a group $(G, \cdot)$ is a subgroup of $G$ iff $T \circ T \subseteq T$ and $T^{-1} \subseteq T$ or equivalently, $T \circ T^{-1} \subseteq T$.

15. Let $S, T$ be subgroups of a group $(G, \cdot)$. Show that $S \circ T$ is a subgroup of $G$ iff $S \circ T = T \circ S$.

16. Let $G$ be a group and $T_1, T_2$ two proper subgroups of $G$.

Show that $T_1 \cup T_2$ is a subgroup of $G$ iff $T_1 \subseteq T_2$ or $T_2 \subseteq T_1$. Hence show that a group cannot be the union of two proper subgroups.

17. Let $(G, \cdot)$ be a group and $S$ a non-empty subset of $G$.

Show that the subgroup $\langle S \rangle$ generated by $S$ consists of all elements of the form $x_1x_2 \cdots x_n$, where $x_i \in S \cup S^{-1}$ for all integers $n \ge 1$ and, moreover, if $S$ is a countable subset of $G$, then $\langle S \rangle$ is countable.

[Hint. Let $T$ be the set of all finite products of the given form $x_1x_2 \cdots x_n$. Taking $n = 1$ and allowing $x_1$ to run over $S$, it follows that $S \subseteq T$. Suppose $a = x_1x_2 \cdots x_n$ and $b = y_1y_2 \cdots y_m \in T$. Now $ab^{-1} = x_1x_2 \cdots x_n x_{n+1} x_{n+2} \cdots x_{n+m}$, where $x_{n+i} = (y_{m-i+1})^{-1}$.

But $y_j \in S \cup S^{-1} \Rightarrow y_j^{-1} \in S \cup S^{-1}$. Consequently, $x_{n+i} \in S \cup S^{-1} \Rightarrow x_i \in S \cup S^{-1}$, $i = 1, 2, \ldots, m + n \Rightarrow ab^{-1} \in T \Rightarrow T$ is a subgroup of $G$. Now $S \subseteq T \Rightarrow \langle S \rangle \subseteq T$. Since $S \subseteq \langle S \rangle$ and $\langle S \rangle$ is a subgroup, $S^{-1} \subseteq \langle S \rangle$. Hence $S \cup S^{-1} \subseteq \langle S \rangle$. By closure property and induction, it follows that any finite product of elements of $S \cup S^{-1}$ is also contained in $\langle S \rangle$, i.e., $T \subseteq \langle S \rangle$.]

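18. Show that $(\mathbf{R}, +)$ is not finitely generated (+ denotes usual addition of reals).

[Hint. If possible, $\exists$ a finite subset $S \subset \mathbf{R}$ such that $\langle S \rangle = \mathbf{R}$. Then $S$ is countable $\Rightarrow \langle S \rangle = \mathbf{R}$ is countable (by Ex. 17) $\Rightarrow$ a contradiction.]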

19. The dihedral group $D_n$: The group $D_n$ is completely determined by two generators $s, t$ and the defining relations $t^n = 1$, $s^2 = 1$, $(ts)^2 = 1$, and $t^k \ne 1$ for $0 < k < n$, where 1 denotes the identity element. Find $|D_n|$. Construct the Cayley table for the group $D_3$. Interpret $D_n$ as a group of symmetries of a regular $n$-gon.
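One concrete model of these generators, assumed here purely for experimentation, takes $t$ to be the rotation $i \mapsto i + 1 \pmod n$ and $s$ the reflection $i \mapsto -i \pmod n$ of the vertices $0, \ldots, n-1$ of a regular $n$-gon. The sketch below checks the defining relations in this model and counts the generated elements; the helper names are illustrative.

```python
def compose(p, q):
    """Return the permutation 'p after q' (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(p)))

def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def dihedral(n):
    """Return (t, s, elements) for the assumed rotation/reflection model."""
    t = tuple((i + 1) % n for i in range(n))  # rotation through 2*pi/n
    s = tuple((-i) % n for i in range(n))     # reflection fixing vertex 0
    e = tuple(range(n))
    elements, frontier = {e}, [e]
    while frontier:
        g = frontier.pop()
        for gen in (t, s):
            h = compose(gen, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return t, s, elements

def satisfies_relations(n):
    t, s, _ = dihedral(n)
    e = tuple(range(n))
    return (power(t, n) == e and power(s, 2) == e
            and power(compose(t, s), 2) == e
            and all(power(t, k) != e for k in range(1, n)))

for n in (3, 4, 5):
    print(n, len(dihedral(n)[2]), satisfies_relations(n))
```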

20. The quaternion group $H$. This group is completely determined by two generators $s, t$ and the defining relations $s^4 = 1$, $t^2 = s^2$, $ts = s^3t$. Construct its Cayley table.
21. (a) External direct product of groups. Let $H$ and $K$ be groups. Show that

(i) $H \times K$ is a group under the binary operation: $(h, k)(h', k') = (hh', kk')$ $\forall h, h' \in H$ and $k, k' \in K$;

(ii) If $H \ne \{1\}$ and $K \ne \{1\}$, then $H \times K$ is neither isomorphic to $H$ nor to $K$;

(iii) $H \times K$ is an abelian group iff $H$ and $K$ are both abelian groups.

(b) Internal direct product of subgroups. Let $G$ be a group and $H, K$ be normal subgroups of $G$ such that $G = HK$ and $H \cap K = \{1\}$. Then $G$ is called the internal direct product of $H$ and $K$. If $G$ is an additive abelian group, then it is called the internal direct sum of $H$ and $K$ and denoted by $G = H \oplus K$. Prove the following:

(i) $G = H \oplus K$ iff every element $x \in G$ can be expressed uniquely as $x = h + k$ for some $h \in H$ and $k \in K$.

(ii) Let $G$ be the internal direct product of normal subgroups $H$ and $K$. Then $G$ is isomorphic to $H \times K$.

(iii) If $G = H \times K$, then $H$ and $K$ are isomorphic to suitable subgroups $\bar{H}$ and $\bar{K}$ of $G$ respectively, such that $G$ is the internal direct product of $\bar{H}$ and $\bar{K}$.

(iv) Let $f : G \to H$ and $g : H \to K$ be homomorphisms of abelian groups such that $g \circ f$ is an isomorphism. Then $H = \operatorname{Im} f \oplus \ker g$.

22. Klein four-group $C \times C$: If $C$ is the cyclic group of order 2 generated by $g$, show by using Ex. 21 that $C \times C$ is a group of order 4. Work out the multiplication table for $C \times C$. Show that the cyclic group of order 4, $C_4 = \{1, a, a^2, a^3\}$, where $a^4 = 1$, and the group $C \times C$ are not isomorphic.

[Hint. All elements other than $(1, 1)$ of $C \times C$ are of order 2 but it is not true for $C_4$.]

Remark The Klein 4-group is named after Felix Klein (1849-1925). This group 
may also be defined as the set {1, a, b, c } together with the multiplication de- 
fined by the following table: 


 ·  |  1   a   b   c
----+----------------
 1  |  1   a   b   c
 a  |  a   1   c   b
 b  |  b   c   1   a
 c  |  c   b   a   1
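The hint of Ex. 22 can be checked directly from the table in the Remark: listing the element orders of the Klein 4-group and of a cyclic group of order 4 shows that they cannot be isomorphic. In this sketch $C_4$ is modelled additively as $\mathbf{Z}_4$, which is an assumption made for convenience, and the names are illustrative.

```python
ELEMENTS = ["1", "a", "b", "c"]
ROWS = {
    "1": ["1", "a", "b", "c"],
    "a": ["a", "1", "c", "b"],
    "b": ["b", "c", "1", "a"],
    "c": ["c", "b", "a", "1"],
}

def klein_mul(x, y):
    """Look up the product x*y in the table of the Remark."""
    return ROWS[x][ELEMENTS.index(y)]

def order(x, mul, identity):
    k, y = 1, x
    while y != identity:
        y = mul(y, x)
        k += 1
    return k

def c4_mul(x, y):
    return (x + y) % 4  # C_4 written additively as Z_4, with a <-> 1

klein_orders = sorted(order(x, klein_mul, "1") for x in ELEMENTS)
c4_orders = sorted(order(x, c4_mul, 0) for x in range(4))
associative = all(klein_mul(klein_mul(x, y), z) == klein_mul(x, klein_mul(y, z))
                  for x in ELEMENTS for y in ELEMENTS for z in ELEMENTS)
print(klein_orders, c4_orders, associative)
```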


23. (a) Show that a finite group $G$ of order $r$ is cyclic iff it contains an element $g$ of order $r$.

(b) Let $\langle a \rangle$ be a finite cyclic group of order $n$. Show that $\langle a \rangle = \{1, a, a^2, \ldots, a^{n-1}\}$.

24. Show that every element $a\ (\ne 1)$ in a group of prime order $p$ has period $p$.

25. Show that a group in which every element $a\ (\ne 1)$ has period 2 is abelian.

26. Show that every group of order $< 6$ is abelian.
27. Let $G$ be a non-abelian group of order 10. Prove that $G$ contains at least one element of period 5.

28. If $G$ is a group of order $p^2$ ($p$ is prime), show that $G$ is abelian (see Ex. 3 of Exercises-I, Chap. 3).

29. Let $A$ and $B$ be different subgroups of a group $G$ such that both $A$ and $B$ are of prime order $p$. Show that $A \cap B = \{1\}$.

30. Let G be a group of order 27. Prove that G contains a subgroup of order 3. 

31. Let $G$ be a finite group and $A, B$ subgroups of $G$ such that $B \subseteq A$. Show that $[G : A]$ is a factor of $[G : B]$.

32. Let $f$ be a homomorphism from $G$ to $G'$, where $G$ and $G'$ are finite groups. Show that $|\operatorname{Im} f|$ divides $|G'|$ and $\gcd(|G|, |G'|)$.

[Hint. Use Theorem 2.4.5(i) and Theorem 2.5.1.]

33. Show that 

(i) two cyclic groups of the same order are isomorphic; 

(ii) every cyclic group of order n is isomorphic to Z n (cf. Theorem 2.4.12); 

(iii) Z 4 is cyclic and both 1 and 3 are generators; 

(iv) Z p has no proper subgroup if p is a prime integer; 

(v) an infinite cyclic group has exactly two generators.

34. Find all the subgroups of Zis and give their lattice diagram. 

35. (a) Let G be a group of order pq , where p and q are primes. Verify that every 
proper subgroup of G is cyclic. 

(b) Prove that finite groups of rotations of the Euclidean plane are cyclic. 

36. Let G be a finite group of order 30. Can G contain a subgroup of order 9? 
Justify your answer. 

37. If the index of a subgroup $H$ in a group $G$ is 2, prove that $aH = Ha$ $\forall a \in G$ and $G/H$ is isomorphic to a cyclic group of order 2.

38. Give an example of a subgroup $H$ in a group $G$ satisfying $aH = Ha$ $\forall a \in G$ but $[G : H] > 2$.

39. If $H$ is a subgroup of finite index in $G$, prove that there is only a finite number of distinct subgroups of $G$ of the form $aHa^{-1}$.

40. Show that the order of an element of a finite group $G$ divides the order of $G$.

[Hint. The order of an element $a \in G$ is the same as the order of $\langle a \rangle$ (by Theorem 2.4.8) and then apply Lagrange’s Theorem.]

41. Let $G$ be a group and $x \in G$ be such that $x$ is of finite order $m$. If $x^r = 1$, show that $m \mid r$.

42. (i) Let G be a cyclic group of prime order p. Show that G has no proper sub- 
groups. 

(ii) Show that the only non-trivial groups which have no proper subgroups 
are the cyclic groups of prime order p. 

43. Let $(G, \cdot)$, $(H, \cdot)$ and $(K, \cdot)$ be groups and $f : G \to H$ and $g : H \to K$ be group homomorphisms. Show that

(i) $g \circ f : G \to K$ is also a homomorphism. Further, if $f, g$ are both monomorphisms (epimorphisms), then $g \circ f$ is also a monomorphism (epimorphism);

(ii) in particular, if $f$ and $g$ are both isomorphisms, then $g \circ f$ is also an isomorphism;

(iii) further, if $f : G \to H$ is an isomorphism, then $f^{-1} : H \to G$ is also an isomorphism.

44. Let $P$ be the set of all polynomials with integral coefficients. Show that $(P, +)$ is an abelian group isomorphic to the multiplicative group $(\mathbf{Q}^+, \cdot)$ of positive rationals.

[Hint. Let $\{p_n\}_{n=0}^{\infty}$ be an enumeration of the positive primes in increasing order: $p_0 < p_1 < p_2 < \cdots < p_n < \cdots$.

Define $f : P \to \mathbf{Q}^+$ as follows: for $p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n \in P$, $f(p(x)) = p_0^{a_0} p_1^{a_1} p_2^{a_2} \cdots p_n^{a_n} \in \mathbf{Q}^+$.

Verify that $f$ is a homomorphism with $\ker f = \{a_0 + a_1x + \cdots + a_nx^n \in P : p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n} = 1\} = \{0\}$. Check that $f$ is an isomorphism.]
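The map in the hint is easy to experiment with. The sketch below assumes $p_0 < p_1 < \cdots$ are the primes $2, 3, 5, \ldots$ and represents a polynomial by its list of integer coefficients; `to_rational` and `add_poly` are illustrative names, and only the first six primes are tabulated.

```python
from fractions import Fraction

PRIMES = [2, 3, 5, 7, 11, 13]  # p_0 < p_1 < ... (assumed to be the primes)

def to_rational(coeffs):
    """Send a_0 + a_1 x + ... to p_0^{a_0} p_1^{a_1} ...; coeffs[k] is the coefficient of x^k."""
    if len(coeffs) > len(PRIMES):
        raise ValueError("extend PRIMES to handle higher degrees")
    value = Fraction(1)
    for p, a in zip(PRIMES, coeffs):
        value *= Fraction(p) ** a  # negative exponents give denominators
    return value

def add_poly(p, q):
    """Coefficientwise sum of two polynomials."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return [a + b for a, b in zip(p, q)]

p = [1, -2, 0, 3]   # 1 - 2x + 3x^3
q = [-1, 2, 1]      # -1 + 2x + x^2
print(to_rational(p), to_rational(q), to_rational(add_poly(p, q)))
```

Addition of polynomials corresponds to multiplication of the images, and unique prime factorisation is what makes the kernel trivial.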

45. A homomorphism of a group into itself is called an endomorphism. Let $(\mathbf{C}^*, \cdot)$ be the multiplicative group of non-zero complex numbers. Define $f : (\mathbf{C}^*, \cdot) \to (\mathbf{C}^*, \cdot)$ by $f(z) = z^n$ ($n$ is a positive integer). Show that $f$ is an endomorphism. Find $\ker f$.

46. Let $G$ be a group. Show that a mapping $f : G \to G$ defined by $x \mapsto x^{-1}$ is an automorphism of $G$ iff $G$ is abelian.

47. Let $G$ be a group and $a \in G$. Then the mapping $f_a : G \to G$ defined by $f_a(x) = a^{-1}xa$ ($\forall x \in G$) is an automorphism of $G$ (called an inner automorphism induced by $a$) and $f_1 = 1_G$ is an inner automorphism (called a trivial automorphism). Show that for an abelian group $G$, the only inner automorphism is $1_G$ but for a non-abelian group there exist non-trivial inner automorphisms.

48. Show that 

(a) the map $f : (\mathbf{R}^+, \cdot) \to (\mathbf{R}, +)$ defined by $f(x) = \log_e x$ is an isomorphism;

(b) the map $g : (\mathbf{R}, +) \to (\mathbf{R}^+, \cdot)$ defined by $g(x) = e^x$ is an isomorphism.

[Hint. Check that $f$ and $g$ are homomorphisms such that $g \circ f = 1$ and $f \circ g = 1$.]

49. Show that every homomorphic image of an abelian group is abelian but the 
converse is not true. 

50. Show that 

(a) a homomorphism from any group to a simple group is either trivial or sur- 
jective; 

(b) a homomorphism from a simple group is either trivial or one-to-one. 

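51. Prove the following:

$$\operatorname{Aut} \mathbf{Z}_6 \cong \mathbf{Z}_2 \quad \text{and} \quad \operatorname{Aut} \mathbf{Z}_5 \cong \mathbf{Z}_4$$

(see Theorem 2.3.4).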
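A quick computation supports Ex. 51, reading the second group as $\mathbf{Z}_5$ and using the standard fact that every automorphism of $\mathbf{Z}_n$ is multiplication by a unit modulo $n$, so that $\operatorname{Aut} \mathbf{Z}_n$ can be modelled by the group of units mod $n$. The function names are illustrative.

```python
from math import gcd

def units(n):
    """Residues k with gcd(k, n) = 1; x -> kx is then an automorphism of Z_n."""
    return [k for k in range(1, n) if gcd(k, n) == 1]

def mult_order(k, n):
    """Multiplicative order of the unit k modulo n."""
    order, power = 1, k % n
    while power != 1:
        power = (power * k) % n
        order += 1
    return order

def aut_summary(n):
    """Return (|Aut Z_n|, whether the unit group mod n is cyclic)."""
    u = units(n)
    cyclic = any(mult_order(k, n) == len(u) for k in u)
    return len(u), cyclic

print(aut_summary(6), aut_summary(5), aut_summary(8))
```

A cyclic unit group of order 2 (respectively 4) is isomorphic to $\mathbf{Z}_2$ (respectively $\mathbf{Z}_4$); the case $n = 8$ is included only as a contrast, since its unit group has order 4 but is not cyclic.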

52. (a) If $G$ is a cyclic group and $f : G \to G'$ is a homomorphism of groups, show that $\operatorname{Im} f$ is cyclic.

(b) Let G be a group such that every cyclic subgroup of G is normal in G. 
Show that every subgroup of G is a normal subgroup of G. 
53. If $G$ is a cyclic group and $a, b$ are generators of $G$, show that there is an automorphism of $G$ which maps $a$ onto $b$ and also that any automorphism of $G$ maps a generator onto a generator.

54. Let $A(S)$ be the permutation group on $S\ (\ne \emptyset)$. If two permutations $f, g \in A(S)$ are such that for any $x \in S$, $f(x) \ne x \Rightarrow g(x) = x$ and for any $y \in S$, $g(y) \ne y \Rightarrow f(y) = y$ (then $f$ and $g$ are called disjoint), examine the validity of the equality $fg = gf$.

55. Show that every permutation can be expressed as a product of disjoint cycles (the number of cycles may be 1), where the component cycles are uniquely determined except for their order of combination.

56. Show that a permutation $\sigma$ can be expressed as a product of transpositions (i.e., cycles of length 2) in different ways but the number of transpositions in such a decomposition is always odd or always even.

($\sigma$ is said to be an odd or even permutation according as $r$ is odd or even, where $r$ is the number of transpositions in a decomposition.)

57. Show that in a permutation group G, either all permutations are even or exactly 
half the permutations are even, which form a subgroup of G. 

58. Show that all even permutations of a set of n distinct elements (n > 3) form a 
normal subgroup A n of the symmetric group S n (A n is called the alternating 
group of degree n). 

Find its order (see Ex. 14 of SE-II). 

59. (i) Show that the alternating group A n is generated by the following cycles of 
degree 3: (123), (124), . . . , (12 n). 

(ii) If a normal subgroup H of the alternating group A n contains a cycle of 
degree 3, show that H = A n . 

(iii) Show that the groups $S_3$ and $\mathbf{Z}_6$ are not isomorphic but for every proper subgroup $G$ of $S_3$, there exists a proper subgroup $H$ of $\mathbf{Z}_6$ such that $G \cong H$.

60. Let $G$ be a group and $a, b \in G$. The element $aba^{-1}b^{-1}$, denoted by $[a, b]$, is called a commutator of $G$. The subgroup of $G$ whose elements are finite products of commutators of $G$ is called the commutator subgroup of $G$. Denote this subgroup by $C(G)$. Show that

(i) $[a, b][b, a] = 1$;

(ii) $G$ is an abelian group iff $C(G) = \{1\}$;

(iii) $C(G)$ is a normal subgroup of $G$;

(iv) the quotient group $G/C(G)$ is abelian;

(v) if $H$ is any normal subgroup of $G$ such that $G/H$ is abelian, then $C(G) \subseteq H$;

(vi) $C(S_3) = A_3$.
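Parts (i) and (vi) lend themselves to a direct computation in $S_3$. The sketch below forms all commutators, closes them under products, and compares the result with the even permutations; permutations are tuples acting on $0, 1, 2$ and the helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q'."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def commutator(a, b):
    """[a, b] = a b a^{-1} b^{-1}."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

def commutator_subgroup(group):
    """Subgroup generated by all commutators (closure under products suffices here)."""
    gens = {commutator(a, b) for a in group for b in group}
    identity = tuple(range(len(group[0])))
    closure, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for c in gens:
            h = compose(c, g)
            if h not in closure:
                closure.add(h)
                frontier.append(h)
    return closure

def is_even(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

S3 = list(permutations(range(3)))
A3 = {p for p in S3 if is_even(p)}
print(commutator_subgroup(S3) == A3)
```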

61. A group G is called a quaternion group iff G is generated by two elements 
a, b e G satisfying the relations: 

$O(a) = 4$, $a^2 = b^2$ and $ba = a^3b$ (see Ex. 20).

Let G be a subgroup of the General Linear group GL( 2, C) of order 2 over C, 
generated by A = ( _° x *) and B = Show that G is a quaternion group. 
62. Prove the following:

(a) A semigroup $S$ is a semilattice of groups iff $S$ is a regular semigroup, all of whose one-sided ideals are two-sided.

(b) A semigroup $S$ is a semilattice of left groups iff $S$ is a regular semigroup, all of whose left ideals are two-sided.

(c) (i) A semigroup $S$ is a semilattice of left groups iff the condition $I \cap L \cap R = RLI$ holds for every two-sided ideal $I$, every left ideal $L$ and every right ideal $R$ of $S$;

(ii) a semigroup $S$ is a semilattice of right groups iff the condition $I \cap L \cap R = IRL$ holds for every two-sided ideal $I$, every left ideal $L$ and every right ideal $R$ of $S$.

63. A Clifford semigroup is a regular semigroup S in which the idempotents are 
central i.e., in which ex = xe for every idempotent e e S and every x e S. 

Prove that a semigroup S is a Clifford semigroup iff S is a semilattice of 
groups. 

64. Prove the following: 

(i) Finite groups of rotations of the Euclidean plane are cyclic; any such group of order $n$ consists of all rotations about a fixed point through angles of $\frac{2\pi r}{n}$ for $r = 0, 1, 2, \ldots, n-1$.

(ii) The cyclic and dihedral groups and the groups of tetrahedron, octahedron 
and icosahedron are all the finite subgroups of the group of rotations of 
Euclidean 3-dimensional space. 

The latter three groups are called the tetrahedral group of 12 rotations 
carrying a regular tetrahedron to itself, the octahedral group of order 24 of 
rotations of a regular octahedron and the icosahedral group of 60 rotations 
of a regular icosahedron, respectively. 

65. Series of Subgroups. A subnormal (or subinvariant) series of subgroups of a group $G$ is a finite sequence $\{H_i\}$ of subgroups of $G$ such that

$$\{1\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_n = G$$

and $H_i$ is a normal subgroup of $H_{i+1}$ for each $i = 0, 1, \ldots, n-1$.

In addition, if each $H_i$ is a normal subgroup of $G$, then $\{H_i\}$ is called a normal (or invariant) series of $G$.

Two subnormal (normal) series $\{H_i\}$ and $\{T_j\}$ of the same group $G$ are isomorphic iff $\exists$ a one-to-one correspondence between the factor groups $\{H_{i+1}/H_i\}$ and $\{T_{j+1}/T_j\}$ such that the corresponding factor groups are isomorphic.

A subnormal series (normal series) $\{T_j\}$ is a refinement of a subnormal series (normal series) $\{H_i\}$ of a group $G$ iff $\{H_i\} \subseteq \{T_j\}$, i.e., iff each $H_i$ is one of the $T_j$.

A subnormal series $\{T_j\}$ of a group $G$ is a composition series iff all the factor groups $T_{j+1}/T_j$ are simple. A normal series $\{N_i\}$ of $G$ is a principal series iff all the factor groups $N_{i+1}/N_i$ are simple.

A group $G$ is solvable iff it has a composition series $\{T_j\}$ such that all factor groups $T_{j+1}/T_j$ are abelian.

Prove the following: 

(a) (i) Schreier’s Theorem. Two subnormal (normal) series of a group $G$ have isomorphic refinements.

(ii) $\mathbf{Z}$ has neither a composition series nor a principal series.

(iii) Jordan-Hölder’s Theorem. Any two composition (principal) series of a group $G$ are isomorphic.

(iv) If $G$ has a composition (principal) series, and if $A$ is a proper normal subgroup of $G$, then $\exists$ a composition (principal) series containing $A$.

(v) A subgroup of a solvable group and the homomorphic image of a solvable group are solvable.

(vi) If $G$ is a group and $A$ is a normal subgroup of $G$ such that both $A$ and $G/A$ are solvable, then $G$ is solvable.

(b) Let $G$ be a group and $A$ a pseudo-ideal of $G$. Then $G$ is solvable $\Leftrightarrow A$ is solvable.

66. An abelian group $A$ is said to be an injective group iff whenever $i : G' \subseteq G$ ($G'$ is a subgroup of $G$), for any homomorphism $f : G' \to A$ of abelian groups $\exists$ a homomorphism $\tilde f : G \to A$ such that $\tilde f \circ i = f$; and $A$ is said to be a projective group iff for any epimorphism $\alpha : G \to G''$ of abelian groups, and any homomorphism $f : A \to G''$, $\exists$ a homomorphism $\tilde f : A \to G$ such that $\alpha \circ \tilde f = f$. An abelian group $A$ is said to be a divisible group iff for any element $a \in A$, and any non-zero integer $n$, $\exists$ an element $b \in A$ such that $nb = a$. Show that

(i) an abelian group $A$ is injective $\Leftrightarrow A$ is divisible;

(ii) any abelian group is a subgroup of an injective group.


2.12 Exercises (Objective Type) 

Exercises A Identify the correct alternative(s) (there may be more than one) from the following list:

1. Let $G$ be a group and $a, b \in G$ such that $O(a) = 4$, $O(b) = 2$ and $a^3b = ba$. Then $O(ab)$ is

(a) 2 (b) 3 (c) 4 (d) 8. 

2. Let G be a cyclic group of infinite order. Then the number of elements of finite 
order in G is 

(a) 1 (b) 2 (c) infinitely many (d) 4. 

3. Let G be an infinite cyclic group. Then the number of generators of G is 

(a) 1 (b) 3 (c) 2 (d) infinitely many. 

4. The number of subgroups of S 3 is 

(a) 4 (b) 5 (c) 6 (d) 3. 

5. The order of the permutation (1 2 4) (3 5 6) in S 6 is 

(a) 2 (b) 4 (c) 3 (d) 9. 
6. Let $G$ be a finite group and $H, K$ be two finite subgroups of $G$ such that $|H| = 7$ and $|K| = 11$. Then $|H \cap K|$ is

(a) 1 (b) 7 (c) 11 (d) 77. 

7. Let G be a finite group and H , K be two subgroups of G such that \H\ =7, 
\K\ = 11. Then \HK\ is 

(a) 77 (b) 1 (c) 7 (d) 11. 

8. Let G be a finite group of order <400. If G has subgroups of order 45 and 75, 
then \G\ is 

(a) 400 (b) 225 (c) 300 (d) 75. 

9. Let $G$ be a finite group having two subgroups $H$ and $K$ such that $|H| = 45$, $|K| = 75$. If $|G|$ is $n$, where $225 < n < 500$, then $n$ is given by

(a) 230 (b) 475 (c) 450 (d) 460. 

10. Let G be a cyclic group of order 25. Then the number of elements of order 5 in 
G is 

(a) 1 (b) 3 (c) 4 (d) 20. 

11. Let G be a non-trivial group of order n. Then G has 

(a) always an element of order 5; 

(b) no element of order 5; 

(c) an element of order 5 if 5 divides n ; 

(d) an element of order 30 if 30 divides n. 

12. The number of elements of order 5 in the group $\mathbf{Z}_{25} \times \mathbf{Z}_5$ is

(a) 1 (b) 16 (c) 24 (d) 20. 

13. The number of subgroups of order 5 in the group Z5 x Z9 is 

(a) 1 (b) 2 (c) 24 (d) 5. 

14. Automorphism group Aut(Z5) of Z5 is isomorphic to 

(a) $\mathbf{Z}_5$ (b) $\mathbf{Z}_3$ (c) $\mathbf{Z}_4$ (d) $\mathbf{Z}_2$.

15. Let p be a prime integer. Then |Aut(Zp)| is 

(a) p (b) p - 1 (c) 1 (d) p(p - 1). 

16. Let $f : \mathbf{Z} \to \mathbf{Z}$ be a group homomorphism such that $f(1) = 1$. Then

(a) $f$ is an isomorphism;

(b) $f$ is a monomorphism but not an isomorphism;

(c) $f$ is not a monomorphism;

(d) $f$ is not an epimorphism.

17. Let $G$ be an arbitrary group and $f : G \to G$ be an epimorphism. Then

(a) $f$ is always an isomorphism;

(b) $f$ is not always an isomorphism;

(c) $f$ is not always a monomorphism;

(d) $\ker f$ contains more than one element.

18. The number of subgroups of 4Z/16Z is 

(a) 5 (b) 3 (c) 4 (d) 2. 

19. The number of subgroups of (Z, +) which contains 10Z as a subgroup is 

(a) 4 (b) 10 (c) infinite (d) 3. 
20. Consider the groups G = Z 2 x Z 3 , and K = Z^. Then

(a) $\exists$ no homomorphism from $G$ onto $K$;

(b) $\exists$ no monomorphism from $G$ onto $K$;

(c) $\exists$ an isomorphism from $G$ onto $K$;

(d) $\exists$ an epimorphism from $G$ onto $K$.

21. Let $A_n$ be the set of all even permutations of the symmetric group $S_n$. Then

(a) $A_n$ is a subgroup of $S_n$ but not normal;

(b) $A_n$ is not a subgroup of $S_n$;

(c) $A_n$ is a normal subgroup of $S_n$;

(d) $A_n$ is always a non-commutative subgroup of $S_n$.

22. The group $\mathbb{Z}_3 \times \mathbb{Z}_3$ is

(a) cyclic;

(b) not isomorphic to $\mathbb{Z}_9$;

(c) isomorphic to $\mathbb{Z}_9$;

(d) simple.

23. The group $\mathbb{Z}_2 \times \mathbb{Z}_2$ is

(a) cyclic;

(b) isomorphic to Klein’s 4-group;

(c) not isomorphic to Klein’s 4-group;

(d) simple.

24. The group $\mathbb{Z}_8 \times \mathbb{Z}_9$ is

(a) not cyclic;

(b) isomorphic to $\mathbb{Z}_8 \times \mathbb{Z}_3 \times \mathbb{Z}_3$;

(c) isomorphic to $\mathbb{Z}_{72}$;

(d) simple.

25. The number of homomorphisms from the group $(\mathbb{Q}, +)$ into the group $(\mathbb{Q}^+, \cdot)$
is

(a) one (b) two (c) infinitely many (d) zero.

26. The number of generators of the group $\mathbb{Z}_p \times \mathbb{Z}_q$, where p, q are relatively prime
to each other, is

(a) pq (b) (p - 1)(q - 1) (c) p(q - 1) (d) q(p - 1).

27. Given a group G, let Z(G) be its center. For $n \in \mathbb{N}^+$ (the set of positive integers)
define $G_n = \{(g_1, \ldots, g_n) \in Z(G) \times \cdots \times Z(G) : g_1 \cdots g_n = 1\}$. As a subset of
the direct product group $G \times \cdots \times G$ (n times direct product of the group G),
$G_n$ is

(a) isomorphic to the direct product $Z(G) \times \cdots \times Z(G)$ ((n - 1) times);

(b) a subgroup but not necessarily a normal subgroup;

(c) a normal subgroup;

(d) not necessarily a subgroup.

28. Let G be a group of order 77. Then its center Z(G) is isomorphic to

(a) $\mathbb{Z}_7$ (b) $\mathbb{Z}_1$ (c) $\mathbb{Z}_{11}$ (d) $\mathbb{Z}_{77}$.

29. The number of group homomorphisms from the symmetric group $S_3$ to the
group $\mathbb{Z}_6$ is

(a) 2 (b) 3 (c) 6 (d) 1.

30. Let G be the group given by G = Q/Z and n be a positive integer. Then the 
statement that there exists a cyclic subgroup of G of order n is 

(a) never true; 

(b) not necessarily true; 

(c) true but not necessarily a unique one; 

(d) true but a unique one. 

31. Let H = {e, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)} and K = {e, (1 2)(3 4)} be
subgroups of $S_4$, where e denotes the identity element of $S_4$. Then

(a) H and K are both normal subgroups of $S_4$;

(b) K is normal in $A_4$ but $A_4$ is not normal in $S_4$;

(c) H is normal in $S_4$, but K is not;

(d) K is normal in H and H is normal in $A_4$.

32. Let n be the order of a permutation $\sigma$ of 11 symbols such that $\sigma$ does not fix
any symbol. Then n is

(a) 18 (b) 15 (c) 28 (d) 30. 

33. Which of the following groups is (are) cyclic? 

(a) Zg 0 Zg (b) Zg 0 Z 9 (c) Zg 0 Z 10 (d) a group of prime order. 

34. Let G be a finite group and H be a subgroup of G. Let O(G) = m and
O(H) = n. Which of the following statements is (are) correct?

(a) If m/n is a prime number, then H is normal in G;

(b) if m = 2n, then H is normal in G;

(c) if there exist normal subgroups A and B of G such that $H = \{ab : a \in A, b \in B\}$,
then H is normal in G;

(d) if [G : H] = 2, then H is normal in G.

35. Let G be a finite abelian group of odd order. Which of the following maps
define an automorphism of G?

(a) The map $x \mapsto x^{-1}$ for all $x \in G$;

(b) the map $x \mapsto x^2$ for all $x \in G$;

(c) the map $x \mapsto x$ for all $x \in G$;

(d) the map $x \mapsto x^{-2}$ for all $x \in G$.

36. Let $GL(n, \mathbb{R})$ denote the group of all invertible $n \times n$ matrices with real entries (with
respect to matrix multiplication). Which of the following
subgroups of $GL(n, \mathbb{R})$ is (are) normal?

(a) The subgroup of all real orthogonal matrices;

(b) the subgroup of all matrices whose trace is zero;

(c) the subgroup of all invertible diagonal matrices;

(d) the subgroup of all matrices with determinant equal to unity.

37. Which of the following subgroups of GL( 2, C) is (are) abelian? 

(a) The subgroup of invertible upper triangular matrices; 

(b) the subgroup A defined byA = { [ _^ ] :a,b e R and \a \ 2 + \b \ 2 = l}; 

(c) the subgroup $G = \{M \in GL_2(\mathbb{C}) : \det M = 1\}$;

(d) the subgroup B defined by B = {[_^ b -\ : a, b e C and \a\ 2 + \b\ 2 = l}. 

38. Let $f : (\mathbb{Q}, +) \to (\mathbb{Q}, +)$ be a non-zero homomorphism. Then

(a) f is always bijective;

(b) f is always surjective;

(c) f is always injective;

(d) f is not necessarily injective or surjective.


39. Consider the element

$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 5 & 3 \end{pmatrix}$

of the symmetric group $S_5$ on five elements. Then

(a) a is conjugate to ( 4 $ ^ \ \ \ ) ; 

(b) the order of $\sigma$ is 5;

(c) $\sigma$ is the product of two cycles;

(d) $\sigma$ commutes with all elements of $S_5$.


40. Let G be a group of order 60. Then 

(a) G has a subgroup of order 30; 

(b) G is abelian; 

(c) G has subgroups of order 2, 3, and 5; 

(d) G has subgroups of order 6, 10, and 15. 


41. Let G be an abelian group of order n. Then 

(a) n = 36 (b) n = 65 (c) n = 15 (d) n = 21.

42. Which of the following subgroups are necessarily normal subgroups? 


(a) The kernel of a group homomorphism; 

(b) the center of a group; 

(c) a subgroup of a commutative group; 

(d) the subgroup consisting of all matrices with positive determinant in the 
group of all invertible n x n matrices with real entries (under matrix multi- 
plication). 


43. Which of the given subgroups H of a group G is a normal subgroup of G?


(a) G is the group of all 2 x 2 invertible upper triangular matrices with real entries under
matrix multiplication and H is the subgroup of all such matrices $(a_{ij})$ such
that $a_{11} = 1$;

(b) G is the group of all n x n invertible matrices with real entries under matrix
multiplication and H is the subgroup of such matrices with determinant 1;

(c) G is the group of all 2 x 2 invertible upper triangular matrices with real entries under
matrix multiplication and H is the subgroup of all such matrices $(a_{ij})$ such
that $a_{11} = a_{22}$;

(d) G is the group of all n x n invertible matrices with real entries under matrix
multiplication and H is the subgroup of such matrices with positive
determinant.

44. Let $S_7$ denote the symmetric group of all permutations of the symbols
{1, 2, 3, 4, 5, 6, 7}. Then

(a) $S_7$ has an element of order 10;

(b) $S_7$ has an element of order 7;

(c) $S_7$ has an element of order 15;

(d) the order of any element of $S_7$ is at most 12.

Exercises B (True/False Statements) Determine which of the following state- 
ments are true or false. Justify your answers with a proof or give counter-examples 
accordingly. 

1. The set of all Möbius transformations $S : \mathbb{C} \to \mathbb{C}$, $z \mapsto \frac{az + b}{cz + d}$, $ad - bc \neq 0$;
$a, b, c, d \in \mathbb{C}$, forms a group under usual composition of maps.

2. The set $SL(2, \mathbb{R}) = \{X \in GL(2, \mathbb{R}) : \det X = 1\}$ is a normal subgroup of the
group $GL(2, \mathbb{R})$ of all real non-singular matrices of order 2.

3. The set $G = \{x \in \mathbb{Q} : 0 < x < 1\}$ is a group under the usual multiplication.

4. If each element of a group G, except its identity element is of order 2, then the 
group is abelian. 

5. A cyclic group can have more than one generator. 

6. Let G be a group of infinite order. Then all the elements of G, which are of the
form $a^n$, $n \in \mathbb{Z}$, for a given $a \neq 1$, are distinct.

7. The octic group $D_4$ is cyclic.

8. If every proper subgroup of a group G is cyclic, then G is also cyclic.

9. Let G be a cyclic group of order n. Then G has $\phi(n)$ distinct generators.

10. If G is an abelian group, then the only inner automorphism of G is the identity 
automorphism. 

11. Let G be the internal direct product of its normal subgroups H and K. Then the
groups G/H and K are isomorphic.

12. If g and $g^2$ are both generators of a cyclic group G of order n, then n is prime.

13. Let G be a finite group of odd order. If $H = \{x^2 : x \in G\}$, then H is a subgroup
of G only if G is cyclic.

14. Two permutations in $S_n$ are conjugate to each other iff they have the same cycle
decomposition.

15. $Z(A_4)$, the center of the alternating group $A_4$, is {e}.

16. Let H be a normal subgroup of a group G. If H and G/H are both cyclic, then 
G is cyclic. 

17. Every group of order 4 is commutative.

18. Every finitely generated group G can be expressed as the union of an ascending
sequence of proper subgroups of G.

19. Let G be a finite group and H be a proper subgroup of G such that $x, y \in G$
and $x, y \notin H \Rightarrow xy \in H$. Then $|G|$ is an even integer.

20. The group (Q, +) is cyclic. 

21. The group (R, +) is cyclic. 

22. Any finitely generated subgroup of (Q, +) is cyclic. 

23. In the group $(\mathbb{Q}, +)$, there exists an ascending sequence of cyclic groups $G_n$
such that $\mathbb{Q} = \bigcup G_n$.

24. Let $(H, \cdot)$ be a subgroup of $(\mathbb{C}^*, \cdot)$ such that $[\mathbb{C}^* : H] = n$ (finite). Then
$H = \mathbb{C}^*$.

25. Every finite subgroup ( H , •) of (C*, •) is cyclic. 

26. The groups (Z, +) and (R, +) are isomorphic. 

27. The groups (R*, •) and (R, +) are isomorphic. 

28. The groups (Z, +) and (Q, +) are isomorphic. 

29. Every proper subgroup of a non-cyclic abelian group is non-cyclic. 

30. A group G cannot be isomorphic to a proper subgroup H of G. 

31. A group G is commutative iff $(ab)^n = a^n b^n$ $\forall a, b \in G$ and for any three
consecutive integers n.

32. $\mathbb{Z}_9$ is a homomorphic image of $\mathbb{Z}_3 \times \mathbb{Z}_3$.

33. The groups $\mathbb{Z}_6 \times \mathbb{Z}_8$ and $\mathbb{Z}_{48}$ are isomorphic.

34. Let G be a group. For a fixed $a \in G$, a mapping $\psi_a : G \to G$ is defined by
$\psi_a(x) = ax$, $\forall x \in G$. Then $\psi_a$ is a permutation of G.


2.13 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Artin 1991;
Birkhoff and Mac Lane 1965; Chatterjee 1964; Herstein 1964; Howie 1976; Hungerford
1974; Jacobson 1974, 1980; Lang 1965; Mac Lane 1997; Malik et al. 1997;
Shafarevich 1997; van der Waerden 1970) for further details.

Chapter 3

Actions of Groups, Topological Groups 
and Semigroups 


The actions of groups, topological groups and semigroups are very important con- 
cepts. An action of a group (semigroup) on a non-empty set assigns to each element 
of the group (semigroup) a transformation of the set in such a way that it assigns 
to the product of the two elements of the group (semigroup) the product of the two 
corresponding transformations. As a consequence, each element of a group deter- 
mines a permutation on the set under a group action. For a topological group action 
on a topological space, this permutation is a homeomorphism and for a Lie group 
action on a differentiable manifold, it is a diffeomorphism. Group actions are ap- 
plied to develop the theory of finite groups. More precisely, group actions are used 
to determine the number of distinct conjugate classes of a subgroup of a finite group 
and also to prove Cayley’s Theorem, Cauchy’s Theorem and Sylow’s Theorems for 
groups. The counting principle is applied to determine the structure of a group of 
prime power order. These groups arise in the Sylow Theorems and in the description
of finite abelian groups. The Sylow Theorems give the existence of Sylow p-subgroups
of a finite group and describe the subgroups of prime power order of an
arbitrary finite group. We discuss the converse of Lagrange’s theorem which is not
true for arbitrary finite groups. We also study actions of topological groups and Lie
groups and obtain some orbit spaces which are very important in topology and geometry.
For example, $\mathbb{R}P^n$ and $\mathbb{C}P^n$ are obtained as orbit spaces. On the other hand, in
this chapter, semigroup actions are applied to theoretical computer science yielding 
state machines, which unify computer science with mainstream mathematics. 


3.1 Actions of Groups 

In this section we introduce the concept of an action of a group G on a non-empty
set X and study such actions. We show that a group action of G on X assigns to
each element of G a permutation on the set X. This leads to a proof of Cayley’s
theorem that every abstract group is isomorphic to some group of permutations and,
in particular, every group of order n is isomorphic to a certain subgroup of $S_n$. This
interesting theorem of Cayley establishes the equivalence between the historical
concept of a group of transformations and the modern axiomatic concept of a group.

Definition 3.1.1 A group G is said to act on a non-empty set X from the left (or
X is said to be a left G-set) iff there is a function $\sigma : G \times X \to X$, denoted by
$(g, x) \mapsto g \cdot x$ (or gx) for all $x \in X$, $g \in G$ such that

(i) for any $x \in X$, $1 \cdot x = x$, where 1 is the identity of G;

(ii) for any $x \in X$, $g_1, g_2 \in G$, $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$.

$\sigma$ is said to be an action of G on X from the left and the ordered triple $(X, G, \sigma)$ is
called a transformation group.

Example 3.1.1 Let G be a subgroup of $S_n$. Then an action of G on the set
$X = \{1, 2, \ldots, n\}$ is defined by $\sigma \cdot x = \sigma(x)$ $\forall \sigma \in G$, i.e., $\sigma \cdot x$ is the effect of applying
the permutation $\sigma \in G$ to the element $x \in X$.

Example 3.1.2 Let X denote the Euclidean 3-space and G the group of all rotations
of X which leave the origin fixed. Then an action of G on X is defined by $g \cdot x = g(x)$,
the image of the point $x \in X$ under the rotation g.

There may exist different actions of a group on a given set X. 

Example 3.1.3 Let H be a subgroup of a group G. Then for all $h \in H$ and $x \in G$,
each of the following is a group action:

(i) $h \cdot x = hx$ (left translation);

(ii) $h \cdot x = xh^{-1}$;

(iii) $h \cdot x = hxh^{-1}$ (conjugation by h).

Each of the above is an action of H on G. The element $hxh^{-1}$ is said to be a
conjugate of x.

(iv) If H is a normal subgroup of G, then $x \cdot h = xhx^{-1}$ (for $x \in G$, $h \in H$) is an action of G on H,
where the right hand multiplication is the group multiplication.
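
As an informal check of Example 3.1.3, the following Python sketch verifies conditions (i) and (ii) of Definition 3.1.1 by brute force in the special case $H = G = S_3$. The encoding of permutations as tuples and the helper names (`compose`, `inverse`, `is_left_action`) are illustrative assumptions and are not notation from the text. The last rule, $h \cdot x = xh$ without the inverse, is included only to show that it fails condition (ii) in a non-abelian group.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_left_action(H, X, act):
    """Check conditions (i) and (ii) of Definition 3.1.1 by brute force."""
    identity = tuple(range(len(H[0])))
    unit_ok = all(act(identity, x) == x for x in X)
    assoc_ok = all(act(compose(h1, h2), x) == act(h1, act(h2, x))
                   for h1 in H for h2 in H for x in X)
    return unit_ok and assoc_ok

G = list(permutations(range(3)))  # the symmetric group S_3
H = G                             # assumption: take H = G for this check

def left_translation(h, x):       # rule (i): h.x = hx
    return compose(h, x)

def right_inverse(h, x):          # rule (ii): h.x = x h^{-1}
    return compose(x, inverse(h))

def conjugation(h, x):            # rule (iii): h.x = h x h^{-1}
    return compose(compose(h, x), inverse(h))

def right_plain(h, x):            # h.x = xh, expected NOT to be a left action
    return compose(x, h)

results = {
    "left translation": is_left_action(H, G, left_translation),
    "x h^-1": is_left_action(H, G, right_inverse),
    "conjugation": is_left_action(H, G, conjugation),
    "x h": is_left_action(H, G, right_plain),
}
print(results)
```

The first three entries of `results` should be true and the last false, since $x h_1 h_2 \neq x h_2 h_1$ whenever $h_1$ and $h_2$ do not commute.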

Remark A right action of a group G on a set X is defined in an analogous manner. 

Definition 3.1.2 A group G is said to act on a set X from the right (or X is said to
be a right G-set) iff there is a function $\sigma : X \times G \to X$, denoted by $(x, g) \mapsto x \cdot g$
(or xg) for all $x \in X$, $g \in G$ such that

(i)’ for any $x \in X$, $x \cdot 1 = x$;

(ii)’ for any $x \in X$, $g_1, g_2 \in G$, $x \cdot (g_1 g_2) = (x \cdot g_1) \cdot g_2$.

Remark The essential difference between the right and left G-sets is not whether 
the elements of G are written on the right or left of those of X. The main point is 
the difference between respective conditions (ii) and (ii)’: If X is a left G-set, then
the product $g_1 g_2$ acts on $x \in X$ in such a way that $g_2$ operates first and $g_1$ operates
on the result, but for the right G-sets, $g_1$ operates first and $g_2$ operates on the result.

If X is a left G-set, then for any $x \in X$ and $g \in G$, $x \cdot g = g^{-1} \cdot x$ defines a right
G-set structure. Conversely, if X is a right G-set, then $g \cdot x = x \cdot g^{-1}$ defines a
left G-set structure. Since there is a bijective correspondence between left and right
G-set structures, we need to study only one of them.
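
The first claim of the preceding paragraph can be checked directly. The short derivation below is a sketch that uses only conditions (i) and (ii) of Definition 3.1.1 for the given left action, with $x \cdot g$ defined as $g^{-1} \cdot x$.

```latex
\begin{align*}
x \cdot 1 &= 1^{-1} \cdot x = 1 \cdot x = x, \\
x \cdot (g_1 g_2) &= (g_1 g_2)^{-1} \cdot x
   = (g_2^{-1} g_1^{-1}) \cdot x
   = g_2^{-1} \cdot (g_1^{-1} \cdot x)
   = (x \cdot g_1) \cdot g_2 .
\end{align*}
```

The inverse is what reverses the order of $g_1$ and $g_2$; setting $x \cdot g = g \cdot x$ instead would give a right action only when the elements of G act commutatively.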

Theorem 3.1.1 Let X be a left G-set. For any $g \in G$, the map $X \to X$ defined by
$x \mapsto g \cdot x$ is a permutation on the set X.

Proof Let $\phi_g : X \to X$ be the map defined by $\phi_g(x) = g \cdot x$. Then

$\phi_{g^{-1}}(x) = g^{-1} \cdot x$ for all $g \in G$ and $x \in X$

$\Rightarrow (\phi_g \phi_{g^{-1}})(x) = \phi_g(g^{-1} \cdot x) = g \cdot (g^{-1} \cdot x) = (g g^{-1}) \cdot x = 1 \cdot x = x$

$\forall g \in G$ and $\forall x \in X$ $\Rightarrow \phi_g \phi_{g^{-1}} = 1_X$.

Similarly $\phi_{g^{-1}} \phi_g = 1_X$. Therefore $\phi_g$ is a bijection on X. Consequently, $\phi_g$ is a
permutation on X. □

Remark This theorem shows that the notion of a left G-set X is equivalent to the 
notion of a representation of G by permutations on the set X. 

Theorem 3.1.2 Let X be a left G-set. Then

(i) the relation $\rho$ on X defined by $x \rho y$ iff $gx = y$ for some $g \in G$ is an equivalence
relation;

(ii) for each $x \in X$, $G_x = \{g \in G : gx = x\}$ is a subgroup of G.

Proof (i) $1x = x$ $\forall x \in X$ $\Rightarrow$ $x \rho x$ $\forall x \in X$ $\Rightarrow$ $\rho$ is reflexive. Again $x \rho y$ $\Rightarrow$ $gx = y$
for some $g \in G$ $\Rightarrow$ $x = g^{-1}y$ $\Rightarrow$ $y \rho x$ $\Rightarrow$ $\rho$ is symmetric. Finally, $x \rho y$ and $y \rho z$ $\Rightarrow$
$gx = y$ and $g'y = z$ for some $g, g' \in G$ $\Rightarrow$ $(g'g)x = g'(gx) = g'y = z$ $\Rightarrow$ $x \rho z$
(since $g'g \in G$) $\Rightarrow$ $\rho$ is transitive. Consequently, $\rho$ is an equivalence relation on X.

(ii) $1 \in G_x$ $\Rightarrow$ $G_x \neq \emptyset$. Let $g, h \in G_x$. Then $gx = x$ and $hx = x$, so that
$h^{-1}x = h^{-1}(hx) = x$. Now
$(gh^{-1})x = g(h^{-1}x) = gx = x$ $\Rightarrow$ $gh^{-1} \in G_x$ $\Rightarrow$ $G_x$ is a subgroup of G. □

Definition 3.1.3 Let X be a left G-set. Then the equivalence classes $x\rho$ of Theorem
3.1.2(i) are called the orbits of G on X and each $x\rho$ is called the orbit of $x \in X$,
denoted by orb(x). G is said to act on X transitively iff orb(x) = X for every $x \in X$.
Two orbits of G on X are identical or disjoint and the set of all distinct orbits on X
is called the orbit set, denoted X mod G.

Definition 3.1.4 The subgroup $G_x$ is called the isotropy group of x or the stabilizer
of x.

Clearly, every element of the subgroup $G_x$ fixes x. If $G_x = \{1\}$ $\forall x \in X$, then the action
of G on X is said to be free.

Lemma 3.1.1 If a group G acts on a set X, then there is a bijection from the set
of left cosets of the isotropy group $G_x$ in G onto orb(x) for each $x \in X$.

Proof Let $\{gG_x\}$ denote the set of all left cosets of $G_x$ in G. Consider the map
$f : \{gG_x\} \to \mathrm{orb}(x)$ given by $f(gG_x) = gx$. Let $g, g' \in G$. Then
$gx = g'x \Leftrightarrow g^{-1}g'x = x \Leftrightarrow g^{-1}g' \in G_x \Leftrightarrow g'G_x = gG_x$ $\Rightarrow$ f is well defined and injective.
Clearly, f is surjective and hence a bijection. □

Theorem 3.1.3 Let a group G act on a set X. Then $|\mathrm{orb}(x)|$ is the index $[G : G_x]$
for every $x \in X$. In particular, if G acts transitively on X, then $|X| = [G : G_x]$.

Proof Since $[G : G_x]$ is the cardinal number of the set $\{gG_x\}$, the theorem follows
from Lemma 3.1.1. □
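
For a finite group, Theorem 3.1.3 says that $|\mathrm{orb}(x)| \cdot |G_x| = |G|$. The Python sketch below checks this on one small example. The group chosen (generated by a 4-cycle and a disjoint transposition acting on six points labelled 0 to 5) and all function names are assumptions made purely for illustration.

```python
def compose(p, q):
    """Product pq of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate_group(generators):
    """Close a list of permutations under composition (finite case)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def orbit(group, x):
    """orb(x) = {g.x : g in G} for the natural action g.x = g(x)."""
    return {g[x] for g in group}

def stabilizer(group, x):
    """G_x = {g in G : g.x = x}."""
    return {g for g in group if g[x] == x}

# Hypothetical example: <(0 1 2 3), (4 5)> acting on {0, ..., 5}.
r = (1, 2, 3, 0, 4, 5)
s = (0, 1, 2, 3, 5, 4)
G = generate_group([r, s])

for x in range(6):
    # |orb(x)| * |G_x| = |G|, i.e. |orb(x)| = [G : G_x]
    assert len(orbit(G, x)) * len(stabilizer(G, x)) == len(G)
    print(x, sorted(orbit(G, x)), len(stabilizer(G, x)))
```

Here the action is not transitive: there are two orbits, and the stabilizers have different orders on each, yet the product of orbit size and stabilizer order is always $|G|$.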

Theorem 3.1.4 Let X be a set and A(X) the group of all permutations on X. If a
group G acts on X, then this action induces a homomorphism $G \to A(X)$.

Proof For each $g \in G$, the map $\phi_g : X \to X$ defined by $\phi_g(x) = gx$ is a bijection
by Theorem 3.1.1. Consider the map $\psi : G \to A(X)$ defined by $\psi(g) = \phi_g$.
Then $\forall g, g' \in G$, $\psi(gg') = \phi_{gg'}$. But
$\phi_{gg'}(x) = (gg')x = g(g'x) = \phi_g(g'x) = (\phi_g \phi_{g'})(x)$ $\forall x \in X$ $\Rightarrow$ $\phi_{gg'} = \phi_g \phi_{g'}$.
Thus, $\psi(gg') = \psi(g)\psi(g')$ $\Rightarrow$ $\psi$ is a homomorphism. □

Lemma 3.1.2 For every group G, there is a monomorphism $G \to A(G)$.

Proof By Example 3.1.3, we find that every group G acts on itself by left translation.
Then by Theorem 3.1.4, this action induces a homomorphism $\psi : G \to A(G)$
given by $\psi(g) = \phi_g$, where $\phi_g(g') = gg'$ $\forall g, g' \in G$. Now $\psi(g) = 1_{A(G)}$
(the identity permutation of G) $\Rightarrow$ $gg' = g'$ $\forall g' \in G$ $\Rightarrow$ $g = 1$ $\Rightarrow$ $\psi$ is a monomorphism. □

The following theorem, essentially due to the British mathematician Arthur Cay- 
ley (1821-1895), is a direct consequence of Lemma 3.1.2. He observed that every 
group could be realized as a subgroup of a permutation group. 

Theorem 3.1.5 (Cayley’s Theorem) Every group is isomorphic to a group of permutations.
In particular, every group of order n is isomorphic to a certain subgroup
of $S_n$.

Proof Let G be any group. Then by Lemma 3.1.2, there is a monomorphism
$\psi : G \to A(G)$. Consequently, $G \cong \mathrm{Im}\, \psi$. This implies that G is isomorphic to a group
of permutations. If G is, in particular, a finite group of order n, then A(G) becomes
the symmetric group $S_n$. Hence the last statement follows from the first one. □
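
The construction in the proof can be carried out explicitly for a small group. The following Python sketch, given under the assumption that $G = S_3$ is encoded as tuples and its six elements are labelled 0 to 5 in sorted order, builds the permutation $\phi_g$ of Lemma 3.1.2 for each g and checks that $g \mapsto \phi_g$ is an injective homomorphism into $S_6$. The names `left_translation` and `psi` are illustrative only.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

elements = sorted(permutations(range(3)))        # G = S_3, |G| = 6
index = {g: i for i, g in enumerate(elements)}   # label the elements 0..5

def left_translation(g):
    """phi_g as a permutation of the labels: label(x) -> label(gx)."""
    return tuple(index[compose(g, x)] for x in elements)

psi = {g: left_translation(g) for g in elements}

# psi(gh) = psi(g) psi(h): psi is a homomorphism G -> S_6.
is_homomorphism = all(psi[compose(g, h)] == compose(psi[g], psi[h])
                      for g in elements for h in elements)

# Distinct elements give distinct permutations: psi is a monomorphism.
is_injective = len(set(psi.values())) == len(elements)

print(is_homomorphism, is_injective)
```

The image of `psi` is a subgroup of order 6 inside $S_6$, which has order 720; Cayley's Theorem gives an embedding but not, in general, an economical one.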

Corollary Let G be a group. Then

(i) for each $g \in G$, conjugation by g induces an automorphism of G;

(ii) there is a homomorphism $G \to \mathrm{Aut}\, G$, whose kernel is
$C(G) = \{g \in G : gx = xg\ \forall x \in G\}$.

Proof (i) Let G act on itself by conjugation. Then for each $g \in G$, consider the map
$\sigma_g : G \to G$ defined by conjugation: $\sigma_g(x) = gxg^{-1}$ $\forall x \in G$. Now
$(\sigma_g \sigma_{g^{-1}})(x) = \sigma_g(g^{-1}xg) = g(g^{-1}xg)g^{-1} = x$ $\forall x \in G$ $\Rightarrow$ $\sigma_g \sigma_{g^{-1}} = 1_G$, where $1_G$ is the identity
mapping on G. Similarly, $\sigma_{g^{-1}} \sigma_g = 1_G$. Therefore $\sigma_g$ is a bijection on G. Moreover,
$\forall x, y \in G$, $\sigma_g(xy) = g(xy)g^{-1} = (gxg^{-1})(gyg^{-1}) = \sigma_g(x)\sigma_g(y)$ $\Rightarrow$ $\sigma_g$ is a homomorphism.
Consequently, for each g, $\sigma_g$ is an automorphism of G.

(ii) Let G act on itself by conjugation. Then by Theorem 3.1.4, this action induces
a homomorphism $\psi : G \to A(G)$ given by $\psi(g) = \sigma_g$ (defined in (i)); by (i), $\psi$ takes
its values in Aut G. Now
$g \in \ker \psi \Leftrightarrow \sigma_g = 1_G \Leftrightarrow gxg^{-1} = x\ \forall x \in G \Leftrightarrow gx = xg\ \forall x \in G \Leftrightarrow g \in C(G)$ $\Rightarrow$ $\ker \psi = C(G)$. □

Remark $\sigma_g$ is called the inner automorphism induced by g and the normal subgroup
$\ker \psi = C(G)$ is the center of G. The group of symmetries of the square has four
distinct inner automorphisms but the cyclic group of order 3 has no inner automorphism
except the identity.
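
The two counts in the Remark can be confirmed by enumeration. In the Python sketch below, the symmetry group of the square is modelled (as an assumption of this illustration) by the permutations of the vertices 0, 1, 2, 3 generated by a rotation and a reflection, and each inner automorphism $\sigma_g$ is recorded as the tuple of its values on a fixed listing of the group. The helper names are hypothetical.

```python
def compose(p, q):
    """Product pq of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate_group(generators):
    """Close a list of permutations under composition (finite case)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def inner_automorphisms(group):
    """Set of distinct maps x -> g x g^{-1}, each stored as a tuple of values."""
    elems = sorted(group)
    return {tuple(compose(compose(g, x), inverse(g)) for x in elems)
            for g in group}

# Symmetries of the square: rotation r = (0 1 2 3), reflection s = (1 3).
D4 = generate_group([(1, 2, 3, 0), (0, 3, 2, 1)])
# Cyclic group of order 3 generated by the 3-cycle (0 1 2).
Z3 = generate_group([(1, 2, 0)])

print(len(D4), len(inner_automorphisms(D4)))
print(len(Z3), len(inner_automorphisms(Z3)))
```

Since the homomorphism $g \mapsto \sigma_g$ has kernel C(G), the number of distinct inner automorphisms is $|G|/|C(G)|$: this is 8/2 = 4 for the symmetries of the square and 3/3 = 1 for the cyclic group of order 3.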

Lemma 3.1.3 Let H be a subgroup of a group G and let G act on the set X of all left
cosets of H in G by left translation. Then the kernel of the induced homomorphism
$G \to A(X)$ is contained in H.

Proof The induced homomorphism $\psi : G \to A(X)$ is defined by $\psi(g) = \sigma_g$, where
$\sigma_g : X \to X$ is given by $\sigma_g(xH) = gxH$. Now $g \in \ker \psi$ $\Rightarrow$ $\sigma_g = 1_X$ $\Rightarrow$ $gxH = xH$
$\forall x \in G$. Then for $x = 1$, $g1H = 1H = H$ $\Rightarrow$ $g \in H$. □

Theorem 3.1.6 Let H be a subgroup of index n of a group G such that no non-trivial
normal subgroup of G is contained in H. Then G is isomorphic to a subgroup of $S_n$.

Proof Clearly, H is not a normal subgroup of G unless H = {1}. Now by using the notation of
Lemma 3.1.3, $\ker \psi$ is a normal subgroup of G contained in H and hence by
hypothesis $\ker \psi = \{1\}$. This shows that $\psi$ is a monomorphism. Since X has n elements,
$A(X) \cong S_n$. Consequently, G is isomorphic to a subgroup of $S_n$. □

Corollary Let G be a finite group and p be the smallest prime dividing $|G|$. If H
is a subgroup of G such that [G : H] = p, then H is normal in G.

Proof Let X be the set of all left cosets of H in G. Hence $[G : H] = p$ $\Rightarrow$
$A(X) \cong S_p$. Now consider the map

$\psi : G \to S_p$ as defined in Lemma 3.1.3.

As $\ker \psi$ is a normal subgroup of G contained in H by Lemma 3.1.3 and
$G/\ker \psi$ is isomorphic to a subgroup of $S_p$, $|G/\ker \psi|$ divides $|S_p| = p!$. But every
divisor of $|G/\ker \psi| = [G : \ker \psi]$ must divide $|G| = |\ker \psi|[G : \ker \psi]$. Since no
positive integer less than p (other than 1) can divide $|G|$, $|G/\ker \psi|$ must be p or 1.
Then $|G/\ker \psi| = [G : H][H : \ker \psi] = p[H : \ker \psi] \geq p$ $\Rightarrow$ $|G/\ker \psi| = p$ and
$[H : \ker \psi] = 1$. Consequently, $H = \ker \psi$ $\Rightarrow$ H is normal in G. □

Definition 3.1.5 Let X and Y be two left G-sets. Then a mapping $f : X \to Y$ is
called G-equivariant or a G-homomorphism or simply a mapping of left G-sets iff
$f(g \cdot x) = g \cdot f(x)$ $\forall x \in X$ and $g \in G$. A G-equivariant map $f : X \to Y$ is called
an isomorphism of left G-sets iff there exists another G-equivariant map $h : Y \to X$
such that $h \circ f = 1_X$ and $f \circ h = 1_Y$. As usual, an automorphism of a G-set is a self
isomorphism.

Definition 3.1.6 Let X be a left G-set. We say that X is a homogeneous left G-set
iff for any elements $x, y \in X$, $\exists g \in G$ such that $g \cdot x = y$, i.e., iff G acts transitively
on X.

Example 3.1.4 Let G be a group and H an arbitrary subgroup of G. Define a map
$\psi : G \times G/H \to G/H$, $(g, g'H) \mapsto gg'H$. Then $\psi$ defines an action of G on G/H
such that G/H is a homogeneous left G-set.

Theorem 3.1.7 Any homogeneous left G-set is isomorphic to some homogeneous
left G-set G/H.

Proof Let X be an arbitrary homogeneous left G-set. Choose an element $x_0 \in X$.
Then $H = \{g \in G : gx_0 = x_0\}$ is a subgroup of G. Consider the map $f : G \to X$
defined by $g \mapsto gx_0$. As X is a homogeneous G-set, f is onto. Now for $g, h \in G$,
$gx_0 = hx_0 \Leftrightarrow h^{-1}gx_0 = x_0 \Leftrightarrow h^{-1}g \in H \Leftrightarrow$ h and g lie in the same left coset of H. Consequently,
f induces a map $\tilde{f} : G/H \to X$, which is clearly a bijection. Moreover, $\tilde{f}$ is
G-equivariant. Consequently, G/H and X are isomorphic left G-sets. □

Remark The isomorphism $\tilde{f}$ and the subgroup H defined above depend on the
choice of the point $x_0 \in X$. H is called the isotropy subgroup corresponding to
$x_0$. A different choice of $x_0$ gives rise to a conjugate subgroup.


3.2 Group Actions to Counting and Sylow’s Theorems

The counting principle and Sylow’s Theorems play a basic role in understanding the 
structure of finite groups. We apply the results of G-sets of Sect. 3.1 in counting and 
prescribe a method for determining the number of distinct orbits in a G-set. While 
studying finite groups, it is a natural question: does the converse of Lagrange’s Theorem
hold for arbitrary finite groups? This means that if a positive integer m divides
the order of a finite group G, does G have a subgroup of order m? If m is
a prime integer, the answer is shown to be positive by Cauchy’s Theorem. But the
alternating group $A_4$ of order 12 has no subgroup of order 6 (see Ex. 4 of SE-I of
Chap. 2). This example shows that the converse of Lagrange’s Theorem is not true 
for arbitrary finite groups. Perhaps the best partial converse (not true for all cases) 
is the First Sylow Theorem which states that the answer is positive whenever m is 
a power of a prime integer. The Second and Third Sylow Theorems come as a nat- 
ural consequence of subgroups of maximal prime power order. In this section we 
begin with the counting principle and prove Cauchy’s Theorem and Sylow’s Theo- 
rems. 

We first present some interesting applications of group actions to counting.

Define $X_g = \{x \in X : gx = x\}$ for each $g \in G$ and $G_x = \{g \in G : gx = x\}$ for
each $x \in X$, where G is a group and X is a G-set.

Theorem 3.2.1 (Burnside) Let G be a finite group and X a finite G-set. If r is the
number of orbits of G on X, then

$$r|G| = \sum_{g \in G} |X_g|. \tag{3.1}$$

Proof Consider all ordered pairs (g, x), where gx = x, and let N be the number of
such pairs. Now, for each $g \in G$, there exist $|X_g|$ ordered pairs having g as the first
coordinate. Consequently,

$$N = \sum_{g \in G} |X_g|. \tag{3.2}$$

Again for each $x \in X$, there exist $|G_x|$ [see Theorem 3.1.2(ii)] ordered pairs (g, x)
having x as the second coordinate. Consequently, $N = \sum_{x \in X} |G_x|$.

Then by Theorem 3.1.3, $|\mathrm{orb}(x)| = [G : G_x]$. Since $[G : G_x] = |G|/|G_x|$, it follows
that $|\mathrm{orb}(x)| = |G|/|G_x|$. So

$$N = \sum_{x \in X} \frac{|G|}{|\mathrm{orb}(x)|} = |G| \sum_{x \in X} \frac{1}{|\mathrm{orb}(x)|}.$$

Let $\Omega$ be any orbit. Then

$$\sum_{x \in \Omega} \frac{1}{|\mathrm{orb}(x)|} = 1,$$

since $1/|\mathrm{orb}(x)|$ has the same value for all x in the same orbit.
Consequently,

$$N = |G| \cdot (\text{number of orbits of } G \text{ on } X) = |G| \cdot r \;\Rightarrow\; r|G| = \sum_{g \in G} |X_g|.$$

□
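
Equation (3.1) lends itself to a direct numerical check. The Python sketch below takes, as an illustrative assumption, X to be the set of colourings of 4 beads on a circle with 2 colours and G the cyclic group of order 4 acting by rotation, and compares the sum of the numbers of fixed points with the number of orbits counted directly. The names `rotate`, `fixed_total` and `orbits` are not from the text.

```python
from itertools import product

def rotate(x, k):
    """Rotate the tuple x by k positions (the action of the k-th rotation)."""
    return x[k:] + x[:k]

n, colours = 4, 2
X = list(product(range(colours), repeat=n))   # all colourings, |X| = 2^4
G = range(n)                                  # rotation by k = 0, 1, 2, 3

# Right-hand side of (3.1): sum over g of |X_g|.
fixed_total = sum(sum(1 for x in X if rotate(x, k) == x) for k in G)

# Left-hand side: count the orbits directly, then multiply by |G|.
orbits = {frozenset(rotate(x, k) for k in G) for x in X}

print(fixed_total, len(orbits), len(orbits) * n)
```

In this example the fixed-point counts for the four rotations are 16, 2, 4 and 2, so the sum is 24 and the number of orbits is 24/4 = 6, in agreement with the direct count.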


We now find an equation that counts the number of elements of a finite G-set.
Let X be a finite G-set and r the number of orbits of G on X and $\{x_1, x_2, \ldots, x_r\}$
the representative system of the orbits, i.e., the set containing one element from each
orbit of G on X. As every element of X is in exactly one orbit,

$$|X| = \sum_{i=1}^{r} |\mathrm{orb}(x_i)|. \tag{3.3}$$

Let $X_G = \{x \in X : gx = x\ \forall g \in G\}$, i.e., $X_G$ is precisely the union of the one-element
orbits of G on X. If $|X_G| = t$ and the representatives are numbered so that
$x_1, \ldots, x_t$ are the elements of $X_G$, then (3.3) is reduced to

$$|X| = |X_G| + \sum_{i=t+1}^{r} |\mathrm{orb}(x_i)|. \tag{3.4}$$

We shall come back a little later to (3.4) to deduce an important equation known as
the class equation. Before that we need another important notion, known as p-groups.


3.2.1 p-Groups and Cauchy’s Theorem 

The class equation can be effectively utilized when the order of a finite group is a 
power of a prime. In the rest of the section p denotes a prime integer. 

Definition 3.2.1 (p-groups) A finite group G is said to be a prime power group or
a p-group iff the order of every non-identity element in G is a positive power of the
prime p. A subgroup H of a group G is a p-subgroup of G iff H is itself a p-group.

Theorem 3.2.2 Let G be a group of order $p^n$ for some prime integer p and positive
integer n. If X is a finite G-set, then $|X| \equiv |X_G| \pmod{p}$.

Proof Since $|\mathrm{orb}(x_i)| = [G : G_{x_i}]$ by Theorem 3.1.3 and $[G : G_{x_i}]$ divides $|G| = p^n$,
each $|\mathrm{orb}(x_i)|$ is a power of p. For $t + 1 \leq i \leq r$ (in the notation
of equation (3.4)) the orbit of $x_i$ has more than one element, and thus p divides $|\mathrm{orb}(x_i)|$.
This shows that $|X| - |X_G|$ is divisible by p and hence
$|X| \equiv |X_G| \pmod{p}$. □

We now prove Cauchy’s Theorem which is essentially due to A.L. Cauchy
(1789-1857). This theorem and its Corollary 2 show two different situations (other
than the condition of the First Sylow Theorem) under which the converse of Lagrange’s
Theorem also holds.

Theorem 3.2.3 (Cauchy) Let G be a finite group and a prime p divide $|G|$. Then
G has an element of order p and consequently a subgroup of order p.

Proof Consider the set $X = \{(g_1, g_2, \ldots, g_p) : g_i \in G \text{ and } g_1 g_2 \cdots g_p = 1\}$. In
forming an element in X, we may take the elements $g_1, g_2, \ldots, g_{p-1}$ of G arbitrarily,
and then $g_p = (g_1 \cdots g_{p-1})^{-1}$ is determined. Thus $|X| = |G|^{p-1}$. Now $p \mid |G|$ $\Rightarrow$ $p \mid |X|$. Let
$\sigma$ be the cycle $(1\,2\,3 \cdots p)$ in $S_p$. Let $\sigma$ act on X by
$\sigma \cdot (g_1, g_2, \ldots, g_p) = (g_{\sigma(1)}, g_{\sigma(2)}, \ldots, g_{\sigma(p)}) = (g_2, g_3, \ldots, g_p, g_1) \in X$,
since $g_1(g_2 \cdots g_p) = 1$ $\Rightarrow$ $g_1 = (g_2 g_3 \cdots g_p)^{-1}$ $\Rightarrow$ $(g_2 g_3 \cdots g_p)g_1 = 1$.
We consider the subgroup $\langle \sigma \rangle$ of $S_p$ to
act on X by iteration. Now $|\langle \sigma \rangle| = p$. Then by using Theorem 3.2.2,
$|X| \equiv |X_{\langle \sigma \rangle}| \pmod{p}$. Hence $p \mid |X|$ $\Rightarrow$ $p \mid |X_{\langle \sigma \rangle}|$. Since $(1, 1, \ldots, 1) \in X_{\langle \sigma \rangle}$,
there must be at least p elements in
$X_{\langle \sigma \rangle}$ $\Rightarrow$ $\exists$ some $a \in G$, $a \neq 1$, such that $(a, a, \ldots, a) \in X_{\langle \sigma \rangle}$ $\Rightarrow$ $a^p = 1$ $\Rightarrow$ the order of a
is p $\Rightarrow$ $\langle a \rangle$ is a subgroup of G of order p. □
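
The counting argument of the proof can be traced on a small case. The Python sketch below assumes, for illustration only, $G = S_3$ and $p = 3$: it builds the set X of triples with product 1, checks that $|X| = |G|^{p-1}$, finds the tuples fixed by the cyclic shift, and reads off the elements of order p. The names `product_of`, `fixed` and `order_p_elements` are hypothetical.

```python
from itertools import permutations, product

def compose(a, b):
    """Product ab of permutations given as tuples: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

G = list(permutations(range(3)))   # G = S_3, |G| = 6
identity = tuple(range(3))
prime = 3                          # a prime dividing |G|

def product_of(t):
    """Return g_1 g_2 ... g_p for the tuple t = (g_1, ..., g_p)."""
    result = identity
    for g in t:
        result = compose(result, g)
    return result

# X = {(g_1, ..., g_p) : g_1 ... g_p = 1}
X = [t for t in product(G, repeat=prime) if product_of(t) == identity]

# Tuples fixed by the shift (g_1, ..., g_p) -> (g_2, ..., g_p, g_1);
# a tuple fixed by the generator is fixed by the whole cyclic group.
fixed = [t for t in X if t[1:] + t[:1] == t]

# Fixed tuples have the form (a, ..., a) with a^p = 1.
order_p_elements = [t[0] for t in fixed if t[0] != identity]

print(len(X), len(fixed), order_p_elements)
```

Here $|X| = 36$ and there are 3 fixed tuples, so $|X| \equiv |X_{\langle \sigma \rangle}| \pmod{3}$ as Theorem 3.2.2 requires, and the two non-identity fixed tuples come from the two 3-cycles of $S_3$.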

Corollary 1 Let G be a finite group. Then G is a p-group iff $|G|$ is a power of p.

Proof Let G be a finite group such that $|G| = p^r$. Then for each $a \in G$,
$O(\langle a \rangle) = O(a)$ and $O(a) \mid p^r$ $\Rightarrow$ O(a) is a power of p for each $a \in G$ $\Rightarrow$ G is a p-group.

Conversely, let G be a p-group. Then no prime t ($\neq p$) divides $|G|$, otherwise G
would contain an element of order t, a prime number $\neq p$, by Theorem 3.2.3, implying
a contradiction, since every element of G has order a power of p, implying t = p.
Consequently, $|G|$ is a power of p. □

We now apply Cauchy’s Theorem to prove a partial converse of Lagrange’s The- 
orem. 

Corollary 2 Let G be a finite abelian group of order n and m be a positive integer
such that m divides n. Then G has a subgroup of order m.

Proof If m = 1, then {1} consisting of the identity element of G is the required
subgroup of G. If n = 1, then m = n = 1 and the result is trivial. So we assume
that m > 1, n > 1 and prove the corollary by induction on the order n of G. If n = 2,
then m = 2 and hence G is the required subgroup. We now assume that the corollary
is true for all finite abelian groups of order r satisfying $2 \leq r < n$. Let p be
a prime such that p divides m. Then there exists an integer s such that m = ps.
Hence G has a subgroup H of order p by Cauchy’s theorem. Then G/H is a group,
as H is a normal subgroup of the abelian group G. Consequently,
$1 \leq |G/H| = |G|/|H| < |G|$ and $|G/H| = n/p$. Again n = mt for some positive integer t. Hence
$|G/H| = mt/p = st$ implies that s divides $|G/H|$. If s = 1, then m = p and H is the
required subgroup. Otherwise, by the induction hypothesis
the group G/H has a subgroup K/H (say) such that $|K/H| = s$, where
K is a subgroup of G. This implies that $|K| = |K/H||H| = sp = m$. Hence K is a
subgroup of G of order m. □


3.2.2 Class Equation and Sylow’s Theorems 

We are now in a position to define the class equation of a finite group. Let G be a
finite group and X a finite G-set. We now consider the special case of equation (3.4)
in which X = G and the action of G on itself is by conjugation, i.e., for $g \in G$,
$x \mapsto gxg^{-1}$ (cf. Example 3.1.3(iii)). Then
$X_G = \{x \in G : gxg^{-1} = x\ \forall g \in G\} = \{x \in G : gx = xg\ \forall g \in G\} = Z(G)$, the center of G.
If $m = |Z(G)|$ and $n_i$ is the number of elements in the i-th orbit of G under conjugation,
i.e., $n_i = |\mathrm{orb}(x_i)|$, then (3.4) becomes

$$|G| = m + n_{m+1} + \cdots + n_r. \tag{3.5}$$

Remark 1 $n_i$ divides $|G|$ for $m + 1 \leq i \leq r$, since $|\mathrm{orb}(x_i)| = [G : G_{x_i}]$ is a divisor
of $|G|$.

Remark 2 Equation (3.5) is called the class equation of G. Each orbit of G under
conjugation by G is called a conjugate class or conjugacy class in G.
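
As an illustration of equation (3.5), the following Python sketch computes the conjugacy classes of one concrete group by brute force. The choice $G = S_4$, the tuple encoding of permutations and the function name `conjugacy_classes` are assumptions made for this example only.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition the group into orbits under the action x -> g x g^{-1}."""
    remaining = set(group)
    classes = []
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

G = list(permutations(range(4)))              # the symmetric group S_4
classes = conjugacy_classes(G)
sizes = sorted(len(c) for c in classes)

# One-element classes are exactly the elements of the center Z(G).
center_order = sum(1 for c in classes if len(c) == 1)

print(sizes, center_order, sum(sizes))
```

For $S_4$ the class equation reads 24 = 1 + 3 + 6 + 6 + 8, with m = 1 because the center is trivial, and each summand divides 24 as Remark 1 states.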

The class equation is now effectively applied to study a certain class of finite 
groups. Using the class equation, we are going to prove three celebrated theo- 
rems, essentially due to Norwegian mathematician L.M. Sylow (1832-1918), to 
understand the structure of an arbitrary finite group. There are very few theorems which show the existence of subgroups of prescribed order under specific conditions. Among them the best partial converse to Lagrange's Theorem is the First Sylow Theorem. This is a basic theorem and is widely used. The First Sylow Theorem leads to the Second and Third Sylow Theorems. The First Sylow Theorem describes the subgroups whose orders are powers of a prime integer. The Second Sylow Theorem gives an interesting relation among different Sylow $p$-subgroups of a finite group. The Third Sylow Theorem prescribes the number of Sylow $p$-subgroups.

As a first step to prove the Sylow Theorems we now apply group action to determine the number of distinct conjugates of a subgroup of a finite group. Let $G$ be a finite group and $S$ the collection of all subgroups of $G$. We make $S$ into a $G$-set by the action of $G$ on $S$ by conjugation, i.e., if $H \in S$, then $g \cdot H = gHg^{-1}$. Consider the normalizer $N(H) = \{g \in G : gHg^{-1} = H\}$ of $H$ in $G$. Then $N(H)$ is a subgroup of $G$ and $H$ is a normal subgroup of $N(H)$ such that $N(H)$ is the largest subgroup of $G$ having $H$ as a normal subgroup. Clearly, the element $H$ of $S$ is a fixed point under the above conjugation iff $H$ is normal in $G$, and $\mathrm{orb}(H)$ is precisely the set of all subgroups of $G$ which are conjugate to $H$, i.e., the conjugate class of $H$. The isotropy subgroup $G_H = N(H)$ $\Rightarrow$ $|G/N(H)| = [G : N(H)]$ = number of distinct conjugates of $H$ in $G$. Such results are very important in algebraic topology.
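
The count of conjugates via the normalizer can be checked on a small example. The sketch below, written under the assumption that permutations of $\{0,1,2,3\}$ are stored as tuples, takes $G = S_4$ and the subgroup $H$ generated by one transposition, and compares the number of distinct conjugates of $H$ with the index $[G : N(H)]$. The helper names are illustrative only.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(g, H):
    """Return the subgroup g H g^{-1} as a frozenset."""
    g_inv = inverse(g)
    return frozenset(compose(compose(g, h), g_inv) for h in H)

def normalizer(G, H):
    """Return N(H) = { g in G : g H g^{-1} = H }."""
    H = frozenset(H)
    return [g for g in G if conjugate(g, H) == H]

def conjugates(G, H):
    """Return the orbit of H under conjugation by G."""
    return {conjugate(g, H) for g in G}

G = list(permutations(range(4)))        # the symmetric group S_4
H = [(0, 1, 2, 3), (1, 0, 2, 3)]        # identity and the transposition (0 1)

N = normalizer(G, H)
orbit = conjugates(G, H)
print(len(N), len(orbit), len(G) // len(N))   # 4 6 6
```

Here $N(H)$ has order 4 and $H$ has exactly $24/4 = 6$ conjugates, one for each transposition in $S_4$.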

Lemma 3.2.1 Let $H$ be a $p$-subgroup of a finite group $G$. Then
$[N(H) : H] \equiv [G : H] \pmod{p}$.

Proof Let $S$ be the set of all left cosets of $H$ in $G$ and let $H$ act on $S$ by left translation: $h \cdot (xH) = (hx)H$. Then $S$ becomes an $H$-set and $|S| = [G : H]$. Consider $S_H = \{xH \in S : h(xH) = xH\ \forall h \in H\}$. Thus $xH = h(xH)\ \forall h \in H$ $\iff$ $x^{-1}hx = x^{-1}h(x^{-1})^{-1} \in H\ \forall h \in H$ $\iff$ $x^{-1} \in N(H)$ $\iff$ $x \in N(H)$ $\Rightarrow$ the left cosets in $S_H$ are those contained in $N(H)$. The number of such cosets is $[N(H) : H]$ and hence $|S_H| = [N(H) : H]$. Again $H$ is a $p$-group $\Rightarrow$ $H$ has order a power of $p$ by Corollary 1 of Theorem 3.2.3. This shows that $|S| \equiv |S_H| \pmod{p}$ by Theorem 3.2.2. Consequently, $[G : H] \equiv [N(H) : H] \pmod{p}$, i.e., $[N(H) : H] \equiv [G : H] \pmod{p}$. □

Corollary 3 Let $H$ be a $p$-subgroup of a finite group $G$. If $p \mid [G : H]$, then $N(H) \ne H$.

Proof Clearly, $p$ divides $[N(H) : H]$ by Lemma 3.2.1, so $[N(H) : H] \ne 1$. Hence $H \ne N(H)$. □
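
The congruence of Lemma 3.2.1 can be observed numerically. The following sketch is only an illustration on cyclic $p$-subgroups of $S_4$ (an assumption made here to keep the example small); the function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cyclic_subgroup(g):
    """Return the subgroup generated by the single permutation g."""
    identity = tuple(range(len(g)))
    elems, x = [identity], g
    while x != identity:
        elems.append(x)
        x = compose(x, g)
    return frozenset(elems)

def normalizer_order(G, H):
    """Return |N(H)| by testing g H g^{-1} = H for every g in G."""
    return sum(
        1 for g in G
        if frozenset(compose(compose(g, h), inverse(g)) for h in H) == H
    )

def lemma_holds(G, H, p):
    """Check [N(H) : H] ≡ [G : H] (mod p) for a p-subgroup H of G."""
    index_in_G = len(G) // len(H)
    index_in_N = normalizer_order(G, H) // len(H)
    return (index_in_N - index_in_G) % p == 0

G = list(permutations(range(4)))
examples = [
    ((1, 0, 2, 3), 2),   # generated by a transposition, order 2
    ((1, 0, 3, 2), 2),   # generated by a double transposition, order 2
    ((1, 2, 3, 0), 2),   # generated by a 4-cycle, order 4
    ((1, 2, 0, 3), 3),   # generated by a 3-cycle, order 3
]
for generator, p in examples:
    H = cyclic_subgroup(generator)
    print(len(H), p, lemma_holds(G, H, p))
```

For the subgroup of order 3, for instance, $[N(H) : H] = 2$ and $[G : H] = 8$, and indeed $2 \equiv 8 \pmod 3$.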

We are now equipped to prove the First Sylow theorem, which gives the existence 
of p-subgroups of G for any prime power dividing |G|. 

Theorem 3.2.4 (First Sylow Theorem) Let $G$ be a finite group of order $p^n m$, with $m, n \ge 1$, $p$ prime and $\gcd(p, m) = 1$. Then $G$ contains a subgroup of order $p^i$ for each $i$ satisfying $1 \le i \le n$ and every subgroup $H$ of $G$ of order $p^i$ is a normal subgroup of a subgroup of order $p^{i+1}$ for $1 \le i < n$.

Proof We prove the theorem by induction. $p \mid |G|$ $\Rightarrow$ $G$ contains a subgroup of order $p$ by Theorem 3.2.3. We assume that $H$ is a subgroup of order $p^i$ ($1 \le i < n$). Then $p \mid [G : H]$ $\Rightarrow$ $p \mid [N(H) : H]$ by Lemma 3.2.1. Since $H$ is a normal subgroup of $N(H)$, we can form the factor group $N(H)/H$ such that $p \mid |N(H)/H|$. Again the factor group $N(H)/H$ has a subgroup of order $p$ by Theorem 3.2.3. Then this group is of the form $T/H$, where $T$ is a subgroup of $N(H)$ containing $H$. Since $H$ is normal in $N(H)$, $H$ is necessarily normal in $T$. Finally, $|T| = |H||T/H| = p^i p = p^{i+1}$. Thus we prove that the existence of a subgroup $H$ of order $p^i$ for $i < n$ implies the existence of a subgroup $T$, of order $p^{i+1}$, in which $H$ is normal. Thus the theorem follows by an induction argument. □

Corollary 4 Let $G$ be a finite group and $p$ a prime. If $p^i \mid |G|$, then $G$ has a subgroup of order $p^i$.

Definition 3.2.2 A $p$-subgroup $P$ of a finite group $G$ of order $p^n m$, with $m, n \ge 1$ and $\gcd(p, m) = 1$, is said to be a Sylow $p$-subgroup of $G$ iff $P$ is a maximal $p$-subgroup of $G$, i.e., $P \subseteq H \subseteq G$ with $H$ a $p$-subgroup of $G$ implies $P = H$.

Remark If $G$ is a finite group such that $|G| = p^n m$ where $m, n \ge 1$ and $\gcd(m, p) = 1$, then Theorem 3.2.4 shows that the Sylow $p$-subgroups of $G$ are precisely those subgroups of order $p^n$. Moreover, every conjugate of a Sylow $p$-subgroup is a Sylow $p$-subgroup. Its converse is also true by Theorem 3.2.5.

Theorem 3.2.5 (Second Sylow Theorem) Let $P$ and $H$ be Sylow $p$-subgroups of a finite group $G$. Then $P$ and $H$ are conjugate subgroups of $G$.

Proof Let $S$ be the set of all left cosets of $P$ in $G$ and let $H$ act on $S$ by (left) translation, i.e., $h(xP) = hxP\ \forall h \in H$. So, $S$ is a left $H$-set. Then by Theorem 3.2.2, $|S_H| \equiv |S| \pmod{p}$. Since $P$ and $H$ are Sylow $p$-subgroups of $G$, $|S| = [G : P] = [G : H]$ is not divisible by $p$, so $|S_H| \ne 0$. Hence $\exists\, xP \in S_H$. Now $xP \in S_H$ $\iff$ $h(xP) = xP\ \forall h \in H$ $\iff$ $x^{-1}hxP = P\ \forall h \in H$. Hence $x^{-1}Hx \subseteq P$ $\Rightarrow$ $H \subseteq xPx^{-1}$. Again $P$ and $H$ are Sylow $p$-subgroups of $G$ $\Rightarrow$ $|H| = |P| = |xPx^{-1}|$ $\Rightarrow$ $H = xPx^{-1}$ $\Rightarrow$ $P$ and $H$ are conjugate subgroups of $G$. □

Table 3.1 The number of groups of order $\le 32$, up to isomorphisms

|G|                2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Number of groups   1   1   2   1   2   1   5   2   2   1   5   1   2   1  14   1

|G|               18  19  20  21  22  23  24  25  26  27  28  29  30  31  32
Number of groups   5   1   5   2   2   1  15   2   2   5   4   1   4   1  51

The following theorem prescribes the number of Sylow $p$-subgroups of a finite group.

Theorem 3.2.6 (Third Sylow Theorem) If $G$ is a finite group and $p$ divides $|G|$, then the number of Sylow $p$-subgroups is congruent to 1 modulo $p$ and divides $|G|$.

Proof Let $P$ be a Sylow $p$-subgroup of $G$ and $S$ the set of all Sylow $p$-subgroups and let $P$ act on $S$ by conjugation. Then by Theorem 3.2.2, $|S| \equiv |S_P| \pmod{p}$, where $S_P = \{Q \in S : xQx^{-1} = Q\ \forall x \in P\}$. Since $P \in S_P$, $S_P \ne \emptyset$. We claim that $S_P = \{P\}$. Now $Q \in S_P$ $\Rightarrow$ $xQx^{-1} = Q\ \forall x \in P$ $\Rightarrow$ $P \subseteq N(Q)$. Both $P$ and $Q$ are Sylow $p$-subgroups of $G$ and hence of $N(Q)$. Consequently $P$ and $Q$ are conjugates in $N(Q)$ by Theorem 3.2.5. But since $Q$ is normal in $N(Q)$, it is its only conjugate in $N(Q)$. This implies $P = Q$. Consequently, $S_P = \{P\}$. Then

$$|S| \equiv 1 \pmod{p}.$$

Now let $G$ act on $S$ by conjugation. By Theorem 3.2.5 this action has a single orbit, so $|S| = [G : N(P)]$, which divides $|G|$. Then the last part follows. □
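
The two conclusions of the Third Sylow Theorem can be checked by brute force on $S_4$, whose order is $24 = 2^3 \cdot 3$. The sketch below finds all subgroups of order $p^n$ by generating subgroups from pairs of elements; that two generators suffice is an assumption that holds for this small example, not a general fact. All names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def generated_subgroup(gens, n):
    """Return the subgroup of S_n generated by gens (closure under products)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def sylow_subgroups(G, p):
    """Return all subgroups of order p^n (p^n the largest power of p dividing |G|).

    Assumption for this sketch: every Sylow p-subgroup is generated by
    at most two elements, which is true for S_4.
    """
    n = len(G[0])
    order = 1
    while len(G) % (order * p) == 0:
        order *= p
    found = set()
    for a in G:
        for b in G:
            H = generated_subgroup((a, b), n)
            if len(H) == order:
                found.add(H)
    return found

S4 = list(permutations(range(4)))
for p in (2, 3):
    count = len(sylow_subgroups(S4, p))
    print(p, count, count % p == 1, len(S4) % count == 0)
```

For $p = 2$ there are 3 Sylow subgroups (of order 8) and for $p = 3$ there are 4 (of order 3); both counts are congruent to 1 modulo $p$ and divide 24.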

Remark We now list all the groups $G$ of order $\le 7$ up to isomorphisms.

If $|G| = 2$, then $G \cong \mathbf{Z}_2$.

If $|G| = 3$, then $G \cong \mathbf{Z}_3$.

If $|G| = 4$, then $G \cong \mathbf{Z}_4$ or $\mathbf{Z}_2 \oplus \mathbf{Z}_2$.

If $|G| = 5$, then $G \cong \mathbf{Z}_5$.

If $|G| = 6$, then a non-abelian group appears for the first time and $G \cong S_3$ or $\mathbf{Z}_2 \oplus \mathbf{Z}_3$.

If $|G| = 7$, then $G \cong \mathbf{Z}_7$.


Table 3.1 gives the number of groups of order $\le 32$, up to isomorphisms.


3.2.3 Exercises

Exercises-I 

1. If $G$ is a group of order $p^r$, $r \ge 1$, show that $G$ has a normal subgroup of order $p^{r-1}$.

2. Suppose $G$ is a group and $H$ is a subgroup of the center $Z(G)$. Show that $G$ is abelian if $G/H$ is cyclic.

3. Show that for a prime number $p$, every group of order $p^2$ is abelian.

[Hint. Let $G$ be a group of order $p^2$ and $Z(G)$ its center. Then $Z(G) \ne \{1\}$. If $Z(G) = G$, then $G$ is abelian. Suppose $Z(G) \ne G$, then $|G/Z(G)| = p$ $\Rightarrow$ $G/Z(G)$ is cyclic $\Rightarrow$ $G$ is abelian by Exercise 2.]

4. Show that a non-abelian group $G$ of order $p^3$ has a center of order $p$ ($p$ is prime).

[Hint. $Z(G) \ne \{1\}$. Also $Z(G) \ne G$, since $G$ is non-abelian. $|Z(G)| = p^2$ $\Rightarrow$ $|G/Z(G)| = p$ $\Rightarrow$ $G/Z(G)$ is cyclic $\Rightarrow$ $G$ is abelian (by Exercise 2) $\Rightarrow$ a contradiction $\Rightarrow$ $|Z(G)| = p$.]

5. If $P$ is a $p$-Sylow subgroup of a finite group $G$ and $x \in G$, show that $xPx^{-1}$ is also a $p$-Sylow subgroup of $G$.

[Hint. Suppose $|G| = p^m n$, where $p$ is prime such that $p \nmid n$ and if $P$ is a $p$-Sylow subgroup of $G$, then $|P| = p^m$. For $x \in G$, $xPx^{-1}$ is a subgroup of $G$. Define a mapping $f : P \to xPx^{-1}$, $y \mapsto xyx^{-1}\ \forall y \in P$. Show that $f$ is bijective. Then $|P| = p^m = |xPx^{-1}|$ $\Rightarrow$ $xPx^{-1}$ is a $p$-Sylow subgroup of $G$.]

6. If a finite group $G$ has only one $p$-Sylow subgroup $P$, show that $P$ is normal in $G$. Its converse is also true.

[Hint. Using Exercise 5, $xPx^{-1}$ is a $p$-Sylow subgroup $\forall x \in G$ $\Rightarrow$ $xPx^{-1} = P$ (by hypothesis) $\Rightarrow$ $P$ is normal in $G$.]

7. Show that a group $G$ of order 30 is not a simple group.

[Hint. $|G| = 30 = 5 \cdot 3 \cdot 2$. Prove that $G$ has either a normal subgroup of order 5 or a normal subgroup of order 3. Hence $G$ is not a simple group.]

8. Examine whether a group $G$ of order 56 is simple or not.

[Hint. Show that $G$ has at least one non-trivial normal subgroup. So, $G$ is not simple.]

9. Show that a group $G$ of order 108 is not simple.

[Hint. Prove that $G$ has a non-trivial normal subgroup of order 9.]

10. Let $G$ be a group of order $pq$, where $p$ and $q$ are prime numbers such that $p > q$ and $q \nmid (p - 1)$. Show that $G$ is cyclic.

[Hint. $|G| = pq$. Let $n_p$ be the number of Sylow $p$-subgroups of $G$. Then $n_p \mid pq$ and $n_p = pr + 1$ ($r = 0, 1, 2, \ldots$) $\Rightarrow$ $pq = s n_p = s(pr + 1)$ for some positive integer $s$ $\Rightarrow$ $s = pq - spr = p(q - sr) = pt$ (say), where $t = q - sr < p$ as $q < p$.

Hence $pq = s(pr + 1) = pt(pr + 1)$ $\Rightarrow$ $q = t(pr + 1)$ $\Rightarrow$ $r = 0$ (as $q < p$) $\Rightarrow$ $n_p = 1$ $\Rightarrow$ $G$ contains only one Sylow $p$-subgroup $H$ (say) such that $|H| = p$ $\Rightarrow$ $H$ is normal in $G$. Again let $n_q$ be the number of Sylow $q$-subgroups of $G$. Then $n_q \mid pq$ and $n_q = qr' + 1$ ($r' = 0, 1, 2, \ldots$) $\Rightarrow$ $pq = s' n_q = s'(qr' + 1)$ for some positive integer $s'$ $\Rightarrow$ $s' = q(p - s'r') = qt'$ (say), where $t' = p - s'r'$. Hence $pq = qt'(qr' + 1)$ $\Rightarrow$ $p = t'(qr' + 1)$. Since $q \nmid (p - 1)$, $p = t'(qr' + 1)$ $\Rightarrow$ $r' = 0$ $\Rightarrow$ $n_q = 1$ $\Rightarrow$ $G$ contains only one Sylow $q$-subgroup $K$ (say) such that $|K| = q$ $\Rightarrow$ $K$ is normal in $G$. Now proceed to show that $G$ is cyclic.]

11. Let G be a group containing an element of finite order n(> 1) and exactly two 
conjugate classes. Show that \G\ =2. 

12. Show that any group G of order 35 is cyclic. 

13. If the order of a group G is 42, prove that G contains a unique Sylow 7- 
subgroup which is normal. 

14. Show that the center Z(G) of a non-trivial finite p -group G contains more than 
one element. 

15. Prove that there exists no simple group of order 48. 

16. Prove that any group of order 15 is cyclic. 

17. Prove that a group of order 65 is cyclic. 

18. Prove that no group of order 65 is simple. 

19. If a non-trivial finite group G has no non-trivial subgroups, then prove that G 
is a group of prime order. 

20. Let $G$ be a finite group. If $G$ has exactly one non-trivial subgroup, then prove that the order of $G$ is $p^2$ for some prime $p$.

21. Prove that every group $G$ of order 45 has a unique Sylow 3-subgroup of order 9, which is normal.

22. Let $G$ be a group of order $p^n$ ($p$ is prime and $n > 1$). Then prove that $G$ is not a simple group.

23. Let $G$ be a group of order $pq$, where $p$ and $q$ are prime numbers. Then prove that $G$ is not a simple group.

24. Prove that any group of order $2p$ ($p$ is a prime) has a normal subgroup of order $p$.

25. Identify the correct alternative(s) (there may be more than one) from the fol- 
lowing list: 

• Let $p$ be a prime number and $GL_{50}(\mathbf{F}_p)$ be the group of invertible $50 \times 50$ matrices with entries from the finite field $\mathbf{F}_p$. Then the order of a $p$-Sylow subgroup of the group $GL_{50}(\mathbf{F}_p)$ is:

(a) $p^{50}$ (b) $p^{1250}$ (c) $p^{125}$ (d) $p^{1225}$.

• If $G$ is the group given by $G = \mathbf{Z}_{10} \times \mathbf{Z}_{15}$, then

(a) $G$ contains exactly one element of order 2;

(b) $G$ contains exactly 24 elements of order 5;

(c) $G$ contains exactly five elements of order 3;

(d) $G$ contains exactly 24 elements of order 10.

• If $G$ is the group given by $S_4 \times S_3$, then

(a) $G$ has a normal subgroup of order 72;

(b) a 3-Sylow subgroup of $G$ is normal;

(c) a 2-Sylow subgroup of $G$ is normal;

(d) $G$ has a non-trivial normal subgroup.

• Which of the following statement(s) is (are) valid? 

(a) Any group of order 15 is abelian; 

(b) Any group of order 25 is abelian; 

(c) Any group of order 65 is abelian; 

(d) Any group of order 55 is abelian. 


3.3 Actions of Topological Groups and Lie Groups 

The actions of topological groups and Lie groups are very important in topology 
and geometry. Many well known geometrical objects are obtained as orbit spaces. 

Definition 3.3.1 A topological group $G$ is said to act on a topological space $X$ from the left iff there is a continuous function $\sigma : G \times X \to X$, denoted $(g, x) \mapsto g \cdot x$ (or $gx$), $\forall x \in X$, $\forall g \in G$ such that

(i) for any $x \in X$, $1 \cdot x = x$, where 1 is the identity element of $G$;

(ii) for any $x \in X$, $g_1, g_2 \in G$, $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$.

$\sigma$ is said to be a topological action (or just an action) of $G$ on $X$ from the left.

The ordered triple $(X, G, \sigma)$ is called a topological transformation group and $X$ is said to be a $G$-space. A right action of $G$ on $X$ is defined in a similar way. There is a bijective correspondence between the left and right $G$-space structures.

Example 3.3.1 (i) The space $\mathbf{R}^n$ is a left $GL(n, \mathbf{R})$-space under the usual multiplication of matrices.

(ii) The space $\mathbf{R}^n$ is also a left $O(n)$-space [see Example 2.9.1 of Chap. 2].

Definition 3.3.2 Let $X$ be a left $G$-space and $X \bmod G$ be the set of all orbits $Gx$, $\forall x \in X$, with the quotient topology, i.e., the largest topology such that the projection map $p : X \to X \bmod G$, $x \mapsto Gx$ is continuous. The space $X \bmod G$ is called the orbit space of the action $\sigma$ of the transformation group $(X, G, \sigma)$ and $p$ is also called an identification map. In particular, if $X$ is a Hausdorff space and $G$ is a finite topological group acting on $X$ in such a way that $g \cdot x = x$ for some $x \in X$ implies $g = 1$, then the action is said to be free.

Some Important Examples

Example 3.3.2 (Real projective space) Let $S^n = \{x \in \mathbf{R}^{n+1} : \|x\| = 1\}$ be the $n$-sphere in $\mathbf{R}^{n+1}$, for $n \ge 1$ and $A : S^n \to S^n$, $x \mapsto -x$ be the antipodal map. Then $A^2 = A \circ A = I$ (identity map). The group $G = \{A, I\}$ is a group of homeomorphisms on $S^n$ and acts on $S^n$. The orbit space $S^n \bmod G$ is called the real projective $n$-space, denoted $\mathbf{R}P^n$. This action is free.




Example 3.3.3 (Complex projective space) Let the circle group $S^1$ act on $S^{2n+1} = \{(z_0, z_1, \ldots, z_n) \in \mathbf{C}^{n+1} : \sum_{i=0}^{n} |z_i|^2 = 1\}$ continuously under the action

$$z \cdot (z_0, z_1, \ldots, z_n) \mapsto (zz_0, zz_1, \ldots, zz_n).$$

The orbit space $S^{2n+1} \bmod S^1$ is called the complex projective $n$-space, denoted $\mathbf{C}P^n$. This action is free.

Example 3.3.4 (Torus) Let $f : \mathbf{R} \to \mathbf{R}$, $x \mapsto x + 1$. Then $f$ is a homeomorphism and for each integer $n$, $f^n : \mathbf{R} \to \mathbf{R}$ is also a homeomorphism. The infinite cyclic group $\mathbf{Z} = \langle f \rangle$ endowed with the discrete topology acts on $\mathbf{R}$ as a group of homeomorphisms. This action is free and the orbit space $\mathbf{R} \bmod \mathbf{Z}$ is homeomorphic to the circle group $S^1$. Again consider the action of the discrete group $\mathbf{Z} \times \mathbf{Z}$ on $\mathbf{R} \times \mathbf{R}$ by setting $(f, g)(x, y) = (x + 1, y + 1)$. Then $(f^m, g^n)(x, y) = (x + m, y + n)$ for every pair of integers $(m, n)$. This action is free and the orbit space is $T = S^1 \times S^1$, which is called a 2-torus (or simply torus). An $n$-torus is defined similarly as an orbit space of $\mathbf{R}^n$, $\forall n \ge 2$.

Definition 3.3.3 A real Lie group $G$ is said to act on a differentiable manifold $M$ from the left iff there is a $C^\infty$ function $\sigma : G \times M \to M$, $(g, x) \mapsto g \cdot x$ (or $gx$), $\forall x \in M$ and $\forall g \in G$ such that

(i) for any $x \in M$, $1 \cdot x = x$, where 1 is the identity element of $G$;

(ii) for any $x \in M$, $g_1, g_2 \in G$, $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$.

Definition 3.3.4 A complex Lie group $G$ is said to act on a complex manifold $M$ from the left iff there is a holomorphic function $\sigma : G \times M \to M$, $(g, x) \mapsto g \cdot x$ (or $gx$), $\forall x \in M$ and $\forall g \in G$ such that

(i) for any $x \in M$, $1 \cdot x = x$, where 1 is the identity element of $G$;

(ii) for any $x \in M$, $g_1, g_2 \in G$, $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$.

Right actions of $G$ on $M$ are defined dually.

Definition 3.3.5 If a Lie group G acts on manifolds, then G -manifolds (homoge- 
neous), G-homomorphisms, G -isomorphisms, G -automorphisms etc. are defined in 
usual ways. 


3.3.1 Exercises 

Exercises-II 

1. Let $X$ be a homogeneous left $G$-set. Prove the following:

(a) If $\sigma : X \to X$ is an automorphism of $X$, then the points $x$ and $\sigma(x)$ have the same isotropy subgroup;

(b) if $x$ and $y \in X$ have the same isotropy subgroup, then there exists an automorphism $\sigma$ of $X$ such that $\sigma(x) = y$;

(c) if $\sigma$ and $\rho$ are automorphisms of $X$ such that for some $x \in X$, $\sigma(x) = \rho(x)$, then $\sigma = \rho$;

(d) a group $A$ of automorphisms of $X$ is the entire group of automorphisms iff for any two points $x, y \in X$ which have the same isotropy subgroup, there exists an automorphism $\sigma \in A$ such that $\sigma(x) = y$.

2. Let $X$ be a homogeneous left $G$-set and $H$ the isotropy subgroup of $G$ corresponding to the point $x_0 \in X$. Prove that $A(X)$, the group of automorphisms of $X$, is isomorphic to $N(H)/H$, where $N(H)$ is the normalizer of $H$.

3. A topological group $G$ is a set $G$ with a group structure and topology on $G$ such that the function $G \times G \to G$, $(s, t) \mapsto st^{-1}$ is continuous.

Show that this condition of continuity is equivalent to the statement that the functions $G \times G \to G$, $(s, t) \mapsto st$ and $G \to G$, $s \mapsto s^{-1}$ are both continuous, i.e., the group operations in $G$ are continuous in the topological space $G$.

For a topological group $G$, a left $G$-space is a topological space $X$ together with a map $G \times X \to X$ (the image of $(s, x) \in G \times X$ under the map is $sx$) such that

(a) for each $x \in X$, $s, t \in G$, $(st)x = s(tx)$;

(b) for each $x \in X$, the relation $1x = x$ holds.

Similarly, a right $G$-space is defined.

Show that there is a bijective correspondence between the left and right $G$-space structures.

4. Two elements $x, y \in X$ in a left $G$-space are called $G$-equivalent iff $\exists s \in G$ such that $sx = y$. Show that this relation is an equivalence relation. Suppose $Gx = \{\text{all } sx : s \in G\}$, the equivalence class determined by $x \in X$, and $X \bmod G = \{\text{all } Gx : x \in X\}$, with the quotient topology, i.e., the largest topology such that the projection $p : X \to X \bmod G$ is continuous.

Prove that for each $s \in G$ the map $X \to X$ ($X$ is a $G$-space), $x \mapsto sx$ is a homeomorphism and the projection $p : X \to X \bmod G$ is an open map (see Simmons (1963, p. 93)).

5. Let $X$ be a $G$-space and $X \bmod G$ be the orbit space. Show that the projection map $p : X \to X \bmod G$ is an open map.

[Hint. Let $V$ be an open subset of $X$. Then $p^{-1}p(V) = \bigcup_{g \in G} gV$ is an open set, as it is a union of open sets $gV$. Hence $p(V)$ is open in $X \bmod G$ for each open set $V$ of $X$.]

6. Let a compact topological group G act on a space X and X mod G be its orbit 
space. Show that 

(a) if X is Hausdorff, then X mod G is also so; 

(b) if X is regular, then X mod G is also so; 

(c) if X is normal, then X mod G is also so; 

(d) if $X$ is locally compact, then $X \bmod G$ is also so.

7. Let $X$ be a $G$-space, where $G$ is a compact topological group and $X$ is a Hausdorff space and $G_x$ be its isotropy group at $x$. Show that the continuous map $f : G/G_x \to \mathrm{orb}(x)$, $gG_x \mapsto gx$ is a homeomorphism.

Hence show that the spaces $U(n)/U(n-1)$ and $S^{2n-1}$ are homeomorphic, where $U(n)$ is the unitary group.

[Hint. Use Lemma 3.1.1 to show that $f$ is a bijection, which is a homeomorphism in this case. The isotropy group at $(0, 0, \ldots, 0, 1)$ is $U(n-1)$.]

8. Show that the map $f : GL(n, \mathbf{R}) \times \mathbf{R}^n \to \mathbf{R}^n$, defined by $(A, X) \mapsto AX$, the product of the $n \times n$ matrix $A \in GL(n, \mathbf{R})$ and $n \times 1$ column matrix $X \in \mathbf{R}^n$, is an action of $GL(n, \mathbf{R})$ on $\mathbf{R}^n$.

Hence show that the action of the group $O(n)$ of all orthogonal real matrices is transitive on $S^{n-1}$ and the spaces $O(n)/O(n-1)$ and $S^{n-1}$ are homeomorphic.

9. (Irrational flow). Let $\alpha$ be a fixed irrational number, $T = S^1 \times S^1$ be the torus and $\mathbf{R}$ be the additive group of reals. Define an action $\sigma : \mathbf{R} \times T \to T$,

$$r \cdot (e^{2\pi i x}, e^{2\pi i y}) \mapsto (e^{2\pi i (x + r)}, e^{2\pi i (y + \alpha r)}).$$

Show that $\sigma$ is a free action.

10. Let $X$ be a $G$-space. Show that for each $g \in G$, the map $\phi_g : X \to X$, $x \mapsto gx$ is a homeomorphism and the map $\psi : G \to \mathrm{Homeo}(X)$, $g \mapsto \phi_g$ is a group homomorphism.

[Hint. $\phi_g$ is a bijection by Theorem 3.1.1. Since the action is continuous, it follows that $\phi_g$ is a homeomorphism.]

11. If a real Lie group $G$ acts on a differentiable manifold $M$, show that for each $g \in G$, the function $\phi_g : M \to M$, $x \mapsto gx$ is a diffeomorphism.

12. Let $G$ be a Lie group and $M$ be a homogeneous $G$-manifold. Show that

(a) $M$ is $G$-isomorphic to the $G$-manifold $G/H$ for some closed Lie subgroup $H$ of $G$;

(b) if $\psi : M \to M$ is a $G$-automorphism, then $x$ and $\psi(x)$ determine the same closed Lie subgroup of $G$.

13. Let $G$ be a topological group and $X$ a topological space. Suppose the group $G$ acts on the set $X$. If this action $G \times X \to X$ is continuous, then $G$ is said to act on $X$.

Prove the following:

Let $G$ be a topological group acting on a topological space $X$. Then

(i) $H = \{h \in G : hx = x\ \forall x \in X\}$ is a closed normal subgroup of $G$.

(ii) If $G$ is compact, $X$ is Hausdorff and $G_x$ is the isotropy group of $x$, the continuous map $f : G/G_x \to Gx$ defined by $f(gG_x) = gx$ is a homeomorphism.

[Hint. Use Lemma 3.1.1 to show that $f$ is a bijection. Then use the result that if $X$ is compact, $Y$ is Hausdorff and $f : X \to Y$ is continuous and bijective, then $f$ is a homeomorphism.]


3.4 Actions of Semigroups and State Machines

Actions of semigroups are important both in mathematics and computer science. 
The theory of machines developed so far has largely influenced the development of 
computer science and its associated language. In this section we discuss semigroup 
actions and their applications to the theory of state machines to unify computer 
science with the mainstream mathematics. 

Definition 3.4.1 A semigroup $S$ is said to act on a non-empty set $X$ from the left iff there is a function $\sigma : S \times X \to X$, denoted by $(a, x) \mapsto ax$ (or $a \cdot x$) such that

(i) for any $x \in X$, $a, b \in S$, $(ab)x = a(bx)$;

(ii) if identity $1 \in S$, then for any $x \in X$, $1x = x$.

$\sigma$ is said to be an action of $S$ on $X$ from the left.

Similarly, a right action of $S$ on $X$ is defined.

There is a feeling that much of abstract algebra is of little practical use. However, we apply semigroups to the theory of machines and establish some relations between a state machine homomorphism and a homomorphism of the corresponding transformation semigroups. In this section we describe the algebraic aspects of finite state machines. We now proceed to unify finite state machines with semigroup actions. For this we need the concept of a transformation semigroup, which involves an action of a semigroup on a finite set together with a further condition on the action, as given below.

Definition 3.4.2 A transformation semigroup is a pair $(Q, S)$ consisting of a finite set $Q$, a finite semigroup $S$ and a semigroup action $\lambda : Q \times S \to Q$, $(q, s) \mapsto qs$ which means

(i) $q(st) = (qs)t$, $\forall q \in Q$, $\forall s, t \in S$; and such that

(ii) $qs = qt\ \forall q \in Q \Rightarrow s = t$, $s, t \in S$.
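
The two conditions of Definition 3.4.2 can be tested mechanically on finite data. The sketch below is a minimal illustration; the function name `is_transformation_semigroup` and the two examples (the additive groups $\mathbf{Z}_3$ and $\mathbf{Z}_6$, viewed as semigroups acting on $Q = \{0, 1, 2\}$) are assumptions made here, not taken from the text.

```python
def is_transformation_semigroup(Q, S, mul, act):
    """Check conditions (i) and (ii) of a transformation semigroup (Q, S).

    mul(s, t) is the product in S and act(q, s) is the right action q·s.
    """
    # (i) compatibility of the action with the product: q(st) = (qs)t
    action_ok = all(
        act(q, mul(s, t)) == act(act(q, s), t)
        for q in Q for s in S for t in S
    )
    # (ii) faithfulness: distinct elements of S act differently somewhere on Q
    faithful = all(
        s == t or any(act(q, s) != act(q, t) for q in Q)
        for s in S for t in S
    )
    return action_ok and faithful

Q = range(3)

# Z_3 acting on Q by q·s = q + s (mod 3): both conditions hold.
print(is_transformation_semigroup(
    Q, range(3), lambda s, t: (s + t) % 3, lambda q, s: (q + s) % 3))   # True

# Z_6 acting on Q by q·s = q + s (mod 3): condition (i) holds, but
# s = 0 and s = 3 act identically, so condition (ii) fails.
print(is_transformation_semigroup(
    Q, range(6), lambda s, t: (s + t) % 6, lambda q, s: (q + s) % 3))   # False
```

The second example shows why condition (ii) is an extra requirement: an action may be perfectly well defined without being faithful.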

Definition 3.4.3 If $X$ and $Y$ are non-empty sets and $R : X \to Y$ is a relation such that $(x, y) \in R$ and $(x, z) \in R \Rightarrow y = z$, where $x \in X$ and $y, z \in Y$, then $R$ is called a partial function. A partial function $R : X \to Y$ is said to be a function iff $\mathrm{dom}(R) = \{x \in X : (x, y) \in R \text{ for some } y \in Y\} = X$.

Definition 3.4.4 A state machine or a semiautomaton is an ordered triple $\mu = (Q, \Sigma, F)$, where $Q$ and $\Sigma$ are finite sets and $F : Q \times \Sigma \to Q$ is a partial function.

Definition 3.4.5 A state machine $\mu = (Q, \Sigma, F)$ is said to be complete iff $F : Q \times \Sigma \to Q$ is a function.

Such systems can be successfully investigated by using the algebraic techniques, an important achievement of modern algebra.

Example 3.4.1 (Cyclic state machine) Let $m, n$ be positive integers and $Q = \{0, 1, 2, \ldots, m + n - 1\}$, then the diagram of a cyclic state machine is given below:

0 --σ--> 1 --σ--> 2 --σ--> ... --σ--> n-1 --σ--> n --σ--> n+1 --σ--> ... --σ--> n+m-1 --σ--> n

where $F(0, \sigma) = 1$, $F(1, \sigma) = 2$, $\ldots$, $F(n - 1, \sigma) = n$, $F(n, \sigma) = n + 1$, $\ldots$, $F(n + m - 2, \sigma) = n + m - 1$ and $F(n + m - 1, \sigma) = n$.
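
The transition rule of the cyclic state machine can be simulated directly. The following sketch assumes a single input symbol $\sigma$, so the machine is stored as a dictionary from each state to its successor; `cyclic_machine` and `run` are names chosen here for illustration.

```python
def cyclic_machine(n, m):
    """Return the transition table q -> F(q, sigma) of the cyclic state machine.

    States are 0, 1, ..., n + m - 1; the last state n + m - 1 is sent back to n.
    """
    last = n + m - 1
    return {q: (q + 1 if q < last else n) for q in range(n + m)}

def run(F, q, steps):
    """Apply the input symbol sigma `steps` times, starting from state q."""
    for _ in range(steps):
        q = F[q]
    return q

F = cyclic_machine(n=2, m=3)          # states 0..4, cycle 2 -> 3 -> 4 -> 2
print(F)                              # {0: 1, 1: 2, 2: 3, 3: 4, 4: 2}
print(run(F, 0, 5), run(F, 0, 8))     # 2 2
```

Starting from state 0, the machine enters the cycle after $n$ steps and then repeats with period $m$, which is why 5 and 8 inputs lead to the same state.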


We now associate a semigroup with a state machine.

Let $\mu = (Q, \Sigma, F)$ be a state machine which receives a symbol $\sigma \in \Sigma$ when it is in some state $q$ (say) $\in Q$. The machine then moves to the state $F(q, \sigma) \in Q$. We may equally define $F_\sigma : Q \to Q$ by $F_\sigma(q) = F(q, \sigma)$, $\forall q \in Q$.

Let $\mu = (Q, \Sigma, F)$ be a state machine. Consider the set $\Sigma^+$ of all words of length $\ge 1$ in the alphabet $\Sigma$. For a word $\alpha = \sigma_1 \sigma_2 \cdots \sigma_k \in \Sigma^+$, let $F_\alpha$ denote the partial function obtained by applying $F_{\sigma_1}, F_{\sigma_2}, \ldots, F_{\sigma_k}$ in succession. Define an equivalence relation $\rho$ on $\Sigma^+$ by $\alpha \rho \beta \iff F_\alpha = F_\beta$. An equivalence relation $\rho$ on a semigroup $S$ is said to be a congruence relation on $S$ iff $(a, b) \in \rho \Rightarrow (ca, cb) \in \rho$ and $(ac, bc) \in \rho\ \forall c \in S$.

Proposition 3.4.1 $\Sigma^+/\rho$ is a semigroup.

Proof $\Sigma^+$ admits a semigroup structure by concatenation of words. Moreover, $\rho$ is a congruence relation on $\Sigma^+$ and hence $\Sigma^+/\rho$, the set of all congruence classes $(a)$, becomes a semigroup under the binary operation given by

$$(a)(b) = (ab).$$ □

Definition 3.4.6 The quotient semigroup $\Sigma^+/\rho$ is called the semigroup of the state machine $\mu$ and is denoted by $S(\mu)$.
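
For a complete machine, the semigroup of the state machine can be computed as the set of distinct maps $Q \to Q$ induced by non-empty words. The sketch below assumes completeness (so every induced map is total) and represents each map as a tuple of images; the function name `machine_semigroup` is hypothetical. It is applied to the cyclic state machine with $n = 2$ and $m = 3$.

```python
def machine_semigroup(Q, Sigma, F):
    """Return the distinct maps Q -> Q induced by non-empty words over Sigma.

    F is a dict (q, sigma) -> next state and is assumed to be total.
    Each map is stored as a tuple of images listed in the order of Q.
    """
    Q = list(Q)
    index = {q: i for i, q in enumerate(Q)}
    gens = [tuple(F[(q, s)] for q in Q) for s in Sigma]
    elems = set(gens)
    frontier = list(gens)
    while frontier:
        f = frontier.pop()
        for g in gens:
            # Word for f followed by one more symbol: apply f first, then g.
            h = tuple(g[index[f[i]]] for i in range(len(Q)))
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

n, m = 2, 3
Q = range(n + m)
F = {(q, "s"): (q + 1 if q < n + m - 1 else n) for q in Q}

S_mu = machine_semigroup(Q, ["s"], F)
print(len(S_mu))        # 4
print(sorted(S_mu))
```

In this example the words $\sigma, \sigma^2, \sigma^3, \sigma^4$ induce four different maps, while $\sigma^5$ induces the same map as $\sigma^2$, so the semigroup has $n + m - 1 = 4$ elements.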

Let $\mu = (Q, \Sigma, F)$ be a state machine and $PF(Q)$ be the semigroup of all partial functions from $Q$ to $Q$, under the usual composition of relations on $Q$.

Each $\sigma \in \Sigma$ defines a partial function $F_\sigma : Q \to Q$ $\Rightarrow$ $\exists$ a natural function $F : \Sigma \to PF(Q)$, $\sigma \mapsto F_\sigma$. Let $\langle F(\mu) \rangle$ denote the subsemigroup of $PF(Q)$ generated by the set of partial functions $\{F_\sigma : \sigma \in \Sigma\}$.

Corresponding to a state machine $\mu = (Q, \Sigma, F)$, there is a transformation semigroup $(Q, S(\mu))$, denoted by $TS(\mu)$, called the transformation semigroup of $\mu$. Conversely, each transformation semigroup determines a state machine. Suppose $T = (Q, S)$ is a transformation semigroup. We define a state machine $\mu = (Q, S, F)$, where $F : Q \times S \to Q$ is defined by $F(q, s) = qs$, $\forall q \in Q$ and $\forall s \in S$. $\mu$ is called the state machine of $T$, denoted by $SM(T)$.

Definition 3.4.7 Let $\mu = (Q, \Sigma, F)$ and $\mu' = (Q', \Sigma', F')$ be state machines. A pair of maps $(\alpha, \beta) : \mu \to \mu'$ is said to be a state machine homomorphism iff $\alpha : Q \to Q'$ and $\beta : \Sigma \to \Sigma'$ is a pair of maps such that $\alpha \circ F_\sigma = F'_{\beta(\sigma)} \circ \alpha$, $\forall \sigma \in \Sigma$, where $F_\sigma : Q \to Q$ is defined by $F_\sigma(q) = F(q, \sigma)$, $\forall q \in Q$.

Definition 3.4.8 Let $\mu = (Q, \Sigma, F)$ and $\mu' = (Q', \Sigma', F')$ be state machines. A state machine homomorphism $(\alpha, \beta) : \mu \to \mu'$ is said to be

(i) a monomorphism iff $\alpha$ and $\beta$ are both injective;

(ii) an epimorphism iff $\alpha$ and $\beta$ are both surjective;

(iii) an isomorphism iff $(\alpha, \beta)$ is both a monomorphism and an epimorphism (written $\mu \cong \mu'$).
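
The homomorphism condition $\alpha \circ F_\sigma = F'_{\beta(\sigma)} \circ \alpha$ can be checked state by state for complete machines. The sketch below is an illustration under that completeness assumption; the machines (a counter modulo 4 and a counter modulo 2) and the name `is_homomorphism` are chosen here and are not from the text.

```python
def is_homomorphism(Q, Sigma, F, F_prime, alpha, beta):
    """Check alpha(F(q, s)) == F'(alpha(q), beta(s)) for all states q and symbols s.

    F and F_prime are dicts (state, symbol) -> state, assumed total;
    alpha and beta are dicts giving the two maps of the pair (alpha, beta).
    """
    return all(
        alpha[F[(q, s)]] == F_prime[(alpha[q], beta[s])]
        for q in Q for s in Sigma
    )

# mu: counter modulo 4 with input symbol "t"; mu': counter modulo 2 with symbol "u".
Q, Sigma = range(4), ["t"]
F = {(q, "t"): (q + 1) % 4 for q in Q}
F_prime = {(q, "u"): (q + 1) % 2 for q in range(2)}
beta = {"t": "u"}

parity = {q: q % 2 for q in Q}          # sends each state to its parity
halves = {0: 0, 1: 0, 2: 1, 3: 1}       # not compatible with the transitions

print(is_homomorphism(Q, Sigma, F, F_prime, parity, beta))   # True
print(is_homomorphism(Q, Sigma, F, F_prime, halves, beta))   # False
```

The pair (`parity`, `beta`) is an epimorphism in the sense of Definition 3.4.8 but not a monomorphism, since `parity` is surjective and not injective.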


3.4.1 Exercises 

Exercises-III 

1. Let $\mu = (Q, \Sigma, F)$ be a state machine. Show that $\langle F(\mu) \rangle \cong S(\mu)$, where $S(\mu)$ is the semigroup of the state machine $\mu$.

2. Let $(Q, S)$ be a transformation semigroup. Show that there is a natural embedding $\theta : S \to PF(Q)$ and conversely, given any set $Q$ and a subsemigroup $S \subseteq PF(Q)$, $(Q, S)$ is a transformation semigroup.

[Hint. Define $\theta_s : Q \to Q$, $q \mapsto qs$ for each $q \in Q$ and each $s \in S$. Then $\theta : S \to PF(Q)$, $s \mapsto \theta_s$ is a semigroup monomorphism.]

3. Let $\mu = (Q, \Sigma, F)$ be a state machine. Show that there exists a state machine monomorphism $(\alpha, \beta) : \mu \to SM(TS(\mu))$, where $SM(TS(\mu))$ is the state machine of the transformation semigroup $TS(\mu)$ of $\mu$.

4. Let $\mu = (Q, \Sigma, F)$ and $\mu' = (Q', \Sigma', F')$ be complete state machines and $(\alpha, \beta) : \mu \to \mu'$ a state machine homomorphism such that $\alpha$ is onto. Show that $\exists$ a transformation semigroup homomorphism $(f_\alpha, g_\beta) : TS(\mu) \to TS(\mu')$.

5. Let $A = (Q, S)$ be a transformation semigroup. Show that $TS(SM(A)) \cong A$.

[Hint. Suppose $SM(A) = (Q, S, F)$ is the state machine of $A$ and $B$ is the semigroup generated by $SM(A)$. Then the map $\theta : S \to B$, $s \mapsto \theta_s$ (see Ex. 2) is a semigroup isomorphism. Hence the pair $(I_Q, \theta) : A \to TS(SM(A))$ is a transformation semigroup isomorphism.]


3.5 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003; Artin 1991; Bredon 1993; Fraleigh 1982; Ginsburg 1968; Herstein 1964; Holcombe 1982; Hungerford 1974; Jacobson 1974, 1980; Lang 1965; Rotman 1988; Spanier 1966; van der Waerden 1970) for further details.


Chapter 4

Rings: Introductory Concepts 


In the earlier chapters we have studied groups and their applications. Another fun- 
damental concept in the study of modern algebra is that of a ring. Rings also serve 
as one of the fundamental building blocks for modern algebra. A group is endowed 
with only one binary operation but a ring is endowed with two binary operations 
connected by distributive laws. Fields form a very important class of rings. Un- 
der usual addition and multiplication, the set of integers Z is the prototype of 
the ring structure and the sets Q, R, C (of rational numbers, real numbers, com- 
plex numbers) are the prototypes of the field structures. The concept of rings arose 
through the attempts to prove Fermat’s Last Theorem and was initiated by Richard 
Dedekind (1831-1916) around 1880. David Hilbert (1862-1943) coined the term 
‘ring’. Emmy Noether (1882-1935) developed the theory of rings under his guidance.
Commutative rings play an important role in algebraic number theory and
algebraic geometry, and non-commutative rings are used in non-commutative ge- 
ometry and quantum groups. This chapter starts with introductory concepts of rings 
and their properties with illustrative examples. Some important rings such as rings 
of power series, rings of polynomials, Boolean rings, rings of continuous functions, 
rings of endomorphisms of abelian groups are also studied in this chapter. Further 
study of theory of rings is given in Chaps. 5-7. 


4.1 Introductory Concepts 

The abstract concept of a ring has its origin in the set of integers $\mathbf{Z}$. The basic difference between a group and a ring is that a group is an algebraic system with one operation and a ring is an algebraic system with two operations. Despite this basic difference, while studying ring theory, many techniques already used for groups are also applied for rings. For example, we obtain in ring theory the appropriate analogues of group homomorphisms, normal subgroups, quotient groups, homomorphism theorems, etc. Integral domains and fields are special classes of rings.

Definition 4.1.1 A ring is an ordered triple $(R, +, \cdot)$ consisting of a non-empty set $R$ and two binary operations ‘$+$’ (called addition) and ‘$\cdot$’ (called multiplication) such that

(i) $(R, +)$ is an abelian group;

(ii) $(R, \cdot)$ is a semigroup, i.e., $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in R$; and

(iii) the operation ‘$\cdot$’ is distributive (on both sides) over the operation ‘$+$’, i.e., $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(b + c) \cdot a = b \cdot a + c \cdot a$ for all $a, b, c \in R$ (distributive laws).

We adopt the usual convention of designating a ring $(R, +, \cdot)$ simply by the set symbol $R$ and assume that ‘$+$’ and ‘$\cdot$’ are known. If in addition, the operation ‘$\cdot$’ is commutative, i.e., $a \cdot b = b \cdot a\ \forall a, b \in R$, $R$ is said to be a commutative ring. Otherwise, $R$ is said to be non-commutative. If there exists an element $1 \in R$ such that $1 \cdot a = a \cdot 1 = a$, $\forall a \in R$, 1 is said to be an identity element of $R$. The identity element is then unique.

Remark A ring $R$ may or may not contain an identity element. If there exists an element $l$ ($r$) $\in R$ such that $l \cdot a = a$ ($a \cdot r = a$) for all $a \in R$, then $l$ ($r$) is called a left (right) identity element of $R$. It may happen that a ring $R$ has a left (right) identity but no right (left) identity. But if both of them exist, then they are equal and we say that the identity element exists in $R$. On the other hand, if $R$ contains more than one left (right) identity element, then $R$ cannot contain the identity element.

As ( R , +) is an abelian group, R has a zero element denoted by 0, and every 
a e R has a unique (additive) inverse, —a. Following usual convention, we can 
write ab in place of a • b and a — b in place of a + (—b). 

Theorem 4.1.1 If R is a ring, then for any a, b, c ∈ R,

(i) 0a = a0 = 0;

(ii) a(−b) = (−a)b = −(ab);

(iii) (−a)(−b) = ab;

(iv) a(b − c) = ab − ac, (b − c)a = ba − ca.

Proof (i) From 0 = 0 + 0, it follows that 0a = (0 + 0)a = 0a + 0a ⇒ 0a = 0 by the
cancellation law for the additive group (R, +). Similarly, it follows that a0 = 0.

(ii) a(−b) + ab = a(−b + b) = a0 = 0 ⇒ a(−b) = −(ab), and (−a)b + ab =
(−a + a)b = 0b = 0 ⇒ (−a)b = −(ab).

(iii) (−a)(−b) = −(a(−b)) by (ii). Now a(−b) + ab = a(−b + b) = a0 = 0 ⇒
−(a(−b)) = ab. Consequently, (−a)(−b) = ab.

(iv) follows from (ii). □

For each positive integer n, we define the nth natural multiple na recursively as
follows: 1a = a and na = (n − 1)a + a when n > 1. If it is agreed to let 0a = 0 (the
zero element of R) and (−n)a = −(na), then the definition of na can be extended to
all integers.

Theorem 4.1.2 If R is a ring, then for any a, b ∈ R and m, n ∈ Z,

(i) (m + n)a = ma + na;

(ii) (mn)a = m(na);

(iii) m(a + b) = ma + mb;

(iv) m(ab) = (ma)b = a(mb), and (ma)(nb) = (mn)(ab).

Proof Trivial. □ 

Definition 4.1.1 does not exclude the possibility that a ring R has an identity 1 = 0.
If so, then for any a ∈ R, a = a1 = a0 = 0, so R has only one element 0. This ring is
called the zero ring, denoted by {0}. Henceforth we assume that any ring with identity
contains more than one element; this excludes the possibility that 1 = 0.

Example 4.1.1 (i) The triples (Z, +, •), (Q, +, •), (R, +, •) and (C, +, •) are all
commutative rings, where ‘+’ is the usual addition and ‘•’ is the usual multiplication,
with the integer 1 as the multiplicative identity element. On the other hand,
the ring 2Z of even integers has no multiplicative identity.

(ii) Given a ring R, let M_2(R) denote the set of all 2 × 2 matrices over R. Then
(M_2(R), +, •) is a ring, non-commutative in general, under the usual compositions of
matrices induced by those of R. The matrix ring M_n(R) is defined analogously.

(iii) Given a non-empty set X, let P(X) denote the power set of X. Then
(P(X), +, •) is a commutative ring, where A + B = A Δ B = (A \ B) ∪ (B \ A)
and A • B = A ∩ B ∀A, B ∈ P(X). Clearly, ∅ is the zero element and X is
the identity element. This ring was introduced by George Boole (1815-1864) as a
formal notation for assertions in logic (see Ex. 4 of Exercises-I).

Remark Neither (P(X), ∪, ∩) nor (P(X), ∩, ∪) forms a ring.

(iv) (Z_n, +, •) (n > 1) is a commutative ring under usual addition and multiplication
of classes. Clearly, (0) is its zero element and (1) is its identity element.

(v) Let X be a non-empty set, (R, +, •) a ring and M the set of all mappings
from X into R. In M define the pointwise sum and product, denoted by f + g and
f • g, respectively, of two mappings f, g ∈ M by (f + g)(x) = f(x) + g(x) and
(f • g)(x) = f(x) • g(x) ∀x ∈ X. Then (M, +, •) is a ring whose zero element is
the constant map c defined by c(x) = 0 for all x ∈ X, and the additive inverse
−f of f is characterized by the rule (−f)(x) = −f(x) for all x ∈ X. Moreover,
if R has a multiplicative identity 1, then M has an identity given by the constant map
1(x) = 1 ∀x ∈ X.

(vi) If R is a ring, then the opposite ring of R, denoted R^op, is the ring that has
the same set of elements as R, the same addition as R, and multiplication ‘∘’
given by a ∘ b = ba, where ba is the usual multiplication in R.
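
As an informal check of Example 4.1.1(iii) and the Remark following it, the sketch below tests the ring axioms of Definition 4.1.1 by brute force on the power set of a three-element set. The helper names power_set and is_ring are illustrative choices, not notation from the text, and the approach is only feasible for small finite sets.

```python
from itertools import combinations, product

def power_set(xs):
    """Return all subsets of xs as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_ring(elems, add, mul):
    """Brute-force check of the axioms of Definition 4.1.1 on a finite set."""
    elems = list(elems)
    S = set(elems)
    # Both operations must be closed on the set.
    if any(add(a, b) not in S or mul(a, b) not in S
           for a, b in product(elems, repeat=2)):
        return False
    # (R, +) needs a zero element and additive inverses.
    zeros = [z for z in elems if all(add(z, a) == a == add(a, z) for a in elems)]
    if not zeros:
        return False
    zero = zeros[0]
    if any(all(add(a, b) != zero for b in elems) for a in elems):
        return False
    # Addition must be commutative.
    if any(add(a, b) != add(b, a) for a, b in product(elems, repeat=2)):
        return False
    # Associativity of both operations and the two distributive laws.
    for a, b, c in product(elems, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
        if mul(add(b, c), a) != add(mul(b, a), mul(c, a)):
            return False
    return True

P = power_set({0, 1, 2})
sym_diff = lambda A, B: A ^ B   # A + B = symmetric difference
inter = lambda A, B: A & B      # A . B = intersection
union = lambda A, B: A | B
```

With these definitions, is_ring(P, sym_diff, inter) holds, while taking union or intersection as the addition fails the additive-inverse test, in line with the Remark.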

Definition 4.1.2 A non-zero element a in a ring R is said to be a left (right) zero
divisor iff ∃ a non-zero element b ∈ R such that ab = 0 (ba = 0). A zero divisor is
an element of R which is both a left and a right zero divisor, i.e., a zero divisor is an
element a which divides 0.

Example 4.1.2 (i) In the non-commutative ring M_2(Z),

    ( 2 0 ) ( 0 0 )   ( 0 0 )   ( 0 0 ) ( 2 0 )
    ( 0 0 ) ( 0 3 ) = ( 0 0 ) = ( 0 3 ) ( 0 0 )

shows that the matrix with rows (2 0) and (0 0) is a zero divisor in M_2(Z).

Remark The ring M_2(Z) is an example of a ring having infinitely many zero divisors.

(ii) Z_n is a commutative ring with zero divisors for every composite integer
n > 1; i.e., if n = rs in Z (1 < r, s < n), then (r)(s) = (0) in Z_n.
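
The two parts of Example 4.1.2 can be checked mechanically. The sketch below is a minimal illustration: matrices are represented as nested tuples and the classes of Z_n by the integers 0, ..., n − 1. The function names matmul2 and zero_divisors are assumptions made for this sketch.

```python
def matmul2(A, B):
    """Product of two 2 x 2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def zero_divisors(n):
    """Non-zero classes a of Z_n with a*b = 0 for some non-zero class b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

ZERO = ((0, 0), (0, 0))
A = ((2, 0), (0, 0))
B = ((0, 0), (0, 3))

# Both products from Example 4.1.2(i) vanish.
both_zero = matmul2(A, B) == ZERO and matmul2(B, A) == ZERO
```

Replacing the entry 2 of A by any non-zero integer k gives another zero divisor, which is one way to see the infinitely many zero divisors mentioned in the Remark. For Z_n, zero_divisors(n) is empty exactly when no product of two non-zero classes vanishes.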

The existence (or absence) of a zero divisor in a ring can be characterized with 
the help of the cancellation laws for multiplication in the ring. 

Theorem 4.1.3 A ring R has no zero divisors iff it satisfies the cancellation laws
for multiplication in R; i.e., for a, b, c ∈ R with a ≠ 0, each of ab = ac and
ba = ca implies b = c.

Proof Suppose R has no zero divisors. Let ab = ac, a ≠ 0. Then a(b − c) = 0 ⇒
b − c = 0 ⇒ b = c. Thus ab = ac ⇒ b = c. Similarly, ba = ca ⇒ b = c.

Conversely, let R satisfy the cancellation laws for multiplication. Suppose ab =
0, a ≠ 0. Then ab = a0 ⇒ b = 0 by the cancellation law. Similarly, ab = 0, b ≠ 0 ⇒
a = 0. Consequently, R has no zero divisors. □

Remark According to Definition 4.1.2, 0 is not a zero divisor. 

If R contains 1, then 1 is not a zero divisor; more generally, no element of R having a
multiplicative inverse is a zero divisor.

Definition 4.1.3 A commutative ring with 1 which has no zero divisors is called an 
integral domain. 

Note The ring of integers Z is an example of an integral domain, hence the name 
integral domain has been chosen. 

Theorem 4.1.4 A commutative ring R with 1 is an integral domain iff the cancellation
law:

If a ≠ 0, then ab = ac ⇒ b = c

holds ∀a, b, c ∈ R.

Proof It follows from Theorem 4.1.3 and Definition 4.1.3. □ 

Proposition 4.1.1 (Z_n, +, •) (n > 1) is an integral domain iff n is a prime integer.

Proof Clearly, Z_n is a commutative ring with (0) as its zero element and (1) as its
identity element. Let n be a prime integer, and let (a) and (b) be two non-zero elements
of Z_n. We claim that (a)(b) ≠ (0). If (a) ≠ (0), then n cannot divide a, i.e., n ∤ a.
Similarly, (b) ≠ (0) ⇒ n ∤ b. Thus (a)(b) = (0) ⇒ (ab) = (0) ⇒ n | ab ⇒ either n | a
or n | b (since n is prime) ⇒ a contradiction. Consequently, (a)(b) ≠ (0). This shows
that Z_n is an integral domain.

Conversely, let Z_n be an integral domain. Suppose n is not a prime integer. Then
there exist integers p, q such that n = pq, 1 < p < n and 1 < q < n. So, (n) =
(pq) = (p)(q) = (0) ⇒ either (p) = (0) or (q) = (0), which is a contradiction.
Hence n must be prime. □
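
Proposition 4.1.1 lends itself to a direct numerical check for small n. The following sketch compares a brute-force test for the absence of zero divisors in Z_n with a trial-division primality test; both helper names are hypothetical, and the comparison is of course evidence for small n only, not a proof.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def has_no_zero_divisors(n):
    """True iff no product of two non-zero classes of Z_n is (0)."""
    return all((a * b) % n != 0 for a in range(1, n) for b in range(1, n))

# Z_n is commutative with identity (1), so it is an integral domain
# exactly when has_no_zero_divisors(n) holds.
agreement = all(has_no_zero_divisors(n) == is_prime(n) for n in range(2, 60))
```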

Definition 4.1.4 Let R be a ring with identity 1. An element a ∈ R is said to be left
(right) invertible iff there exists an element b ∈ R (c ∈ R) such that ba = 1 (ac = 1),
and a is said to be a unit or an invertible element iff there exists an element b ∈ R
such that ab = ba = 1. The element b is then uniquely determined by a, and is
written a⁻¹.

Thus a unit in R is an element a which divides 1 . Clearly, the units in R form a 
(multiplicative) group denoted by U(R). 
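
For R = Z_n the group U(R) can be listed explicitly. The sketch below finds the units of Z_n by searching for inverses and then checks the group properties directly; the names units and is_group_under_mult are illustrative. The comparison with gcd uses the standard fact that (a) is a unit of Z_n iff gcd(a, n) = 1, which is not stated in the text above.

```python
from math import gcd

def units(n):
    """Classes of Z_n (n > 1) that have a multiplicative inverse, found by search."""
    return [a for a in range(n) if any((a * b) % n == 1 for b in range(n))]

def is_group_under_mult(U, n):
    """Closure, identity and inverses for U inside Z_n (associativity is inherited)."""
    S = set(U)
    closed = all((a * b) % n in S for a in U for b in U)
    has_inverses = all(any((a * b) % n == 1 for b in U) for a in U)
    return 1 % n in S and closed and has_inverses

U12 = units(12)
```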

The integral domain Q and the integral domain R enjoy an algebraic advantage 
over the integral domain Z: every equation ax = b (a is not zero) has a unique 
solution in them. Integral domains with this property are called fields. 

Definition 4.1.5 A ring R with 1 is called a division ring or a skew field iff every 
non-zero element of R is invertible. 

Thus a ring R with 1 is a division ring iff its non-zero elements form a (multi- 
plicative) group. 

Proposition 4.1.2 A division ring does not contain any zero divisor. 

Proof Let R be a division ring and ab = 0, a ≠ 0. Then a⁻¹ ∈ R and b = 1b =
(a⁻¹a)b = a⁻¹(ab) = a⁻¹0 = 0. This shows that R does not contain any zero
divisor. □

Theorem 4.1.5 A ring R with 1 is a division ring iff each of the equations ax = b
and ya = b has a unique solution in R, where a, b ∈ R and a ≠ 0.

Proof Let R be a division ring. If b = 0, then x = 0 and y = 0 are solutions of the
given equations. Since R has no zero divisor, each of the solutions is unique. Next,
let b ≠ 0. As the non-null elements of R form a multiplicative group, the given two
equations have unique solutions by Proposition 2.3.1(v).

Conversely, let R be a ring with 1, satisfying the given conditions. Suppose
a, b ∈ R, where ab = 0 and a ≠ 0. Since the equation ax = 0 has a solution x = 0,
and the solution is unique, it follows that b = 0. Thus the product of two non-null
elements of R is non-null. Hence (R \ {0}, •) is a multiplicative semigroup, which is
a group by using Proposition 2.3.3. Consequently, R is a division ring. □

Example 4.1.3 (i) Quaternion rings. The quaternions, essentially due to
Sir William R. Hamilton (1805-1865), constitute a 4-dimensional vector space
over the field of real numbers with basis {1, i, j, k} consisting of four vectors
(see Chap. 8). The quaternion ring, discovered by Hamilton in 1843, is an important
example of a division ring.

To introduce this ring, let H be the set consisting of all ordered 4-tuples of real
numbers, i.e.,

H = {(a, b, c, d) : a, b, c, d ∈ R}.

In H we define + and • by the rules:

(a, b, c, d) + (x, y, z, t) = (a + x, b + y, c + z, d + t),

(a, b, c, d) • (x, y, z, t) = (ax − by − cz − dt, ay + bx + ct − dz,
                               az − bt + cx + dy, at + bz − cy + dx).

Then (H, +, •) is a ring with (0, 0, 0, 0) as the zero element, (−a, −b, −c, −d) as
the negative element of (a, b, c, d) and (1, 0, 0, 0) as the identity element.

We introduce the notation by taking

1 = (1, 0, 0, 0), i = (0, 1, 0, 0), j = (0, 0, 1, 0) and k = (0, 0, 0, 1).

Then 1 is the multiplicative identity of H and

i² = j² = k² = −1, i • j = k, j • k = i, k • i = j, j • i = −k,
k • j = −i, i • k = −j.

These relations show that the commutative law for multiplication fails to hold in H.
The definitions of the algebraic operations ‘+’ and ‘•’ in H show that each element
(a, b, c, d) ∈ H is of the form

(a, b, c, d) = (a, 0, 0, 0)1 + (b, 0, 0, 0)i + (c, 0, 0, 0)j + (d, 0, 0, 0)k
             = a + bi + cj + dk,

on replacing (r, 0, 0, 0) by r, as the map f : {(r, 0, 0, 0) : r ∈ R} → R, (r, 0, 0, 0) ↦ r
is an isomorphism of rings (see Definition 4.4.1). Thus H = {a + bi + cj + dk :
a, b, c, d ∈ R}, where i² = j² = k² = −1, i • j = k, j • k = i, k • i = j, k • j = −i,
j • i = −k and i • k = −j.

An element of H is called a real quaternion and H is called the real quaternion
ring.

Similarly, the integral quaternion ring or rational quaternion ring can be defined
by taking integral or rational coefficients, respectively. Note that any non-zero real
quaternion q = a + bi + cj + dk has a conjugate q̄ defined by q̄ = a − bi − cj − dk.
Then q q̄ = q̄ q = a² + b² + c² + d² = N (say) ≠ 0. Hence q⁻¹ = N⁻¹ q̄ is the
multiplicative inverse of q. Consequently, (H, +, •) is a skew field.



Remark The set of all members of H of the form (a, b, 0, 0) = a + bi, called the
special quaternions, forms a subring isomorphic to C (see Example 4.1.1(i)). Again,
the set of all members of H of the form (a, 0, b, 0), and likewise the set of those of
the form (a, 0, 0, b), forms a subring isomorphic to C.

In the above sense, the real quaternions may be viewed as a suitable generalization
of the complex numbers.

(ii) All square matrices of order 2, given by

    (  a_1 + a_2 i    a_3 + a_4 i )
    ( −a_3 + a_4 i    a_1 − a_2 i )

where i² = −1 and a_r (r = 1, 2, 3, 4) are rational numbers, form a division ring R
under usual addition and multiplication of matrices.
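
The multiplication rule of Example 4.1.3(i) can be transcribed almost literally. The sketch below represents a quaternion as a 4-tuple and, as an assumption made to keep the arithmetic exact, restricts the inverse to integer or rational entries via Python's fractions module. The names qmul, qconj and qinv are illustrative.

```python
from fractions import Fraction

def qmul(p, q):
    """Product (a, b, c, d) . (x, y, z, t) as defined in Example 4.1.3(i)."""
    a, b, c, d = p
    x, y, z, t = q
    return (a * x - b * y - c * z - d * t,
            a * y + b * x + c * t - d * z,
            a * z - b * t + c * x + d * y,
            a * t + b * z - c * y + d * x)

def qconj(q):
    """Conjugate a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def qinv(q):
    """Inverse N^(-1) times the conjugate, for a non-zero q with int/Fraction entries."""
    norm = sum(x * x for x in q)          # N = a^2 + b^2 + c^2 + d^2
    return tuple(Fraction(x, norm) for x in qconj(q))

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
q = (1, 2, 3, 4)
```

Evaluating qmul(I, J) and qmul(J, I) gives K and −K respectively, which exhibits the failure of commutativity, and qmul(q, qinv(q)) returns the identity ONE.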

A commutative division ring carries a special name called field. It is customary 
to introduce the concept of fields abstractly. 

Definition 4.1.6 A commutative division ring is called a field. 

Thus a field is an additively abelian group such that its non-null elements form a 
multiplicatively commutative group and it satisfies distributive laws. 

Remark The division rings in Example 4.1.3 are not fields. 

Example 4.1.4 (i) (R, +, •), (Q, +, •) and (C, +, •) (cf. Example 4.1.1(i)) are fields
(called fields of real numbers, rational numbers and complex numbers, respectively).

(ii) (R × R, +, •) is a field under ‘+’ and ‘•’ defined by (a, b) + (c, d) =
(a + c, b + d), (a, b) • (c, d) = (ac − bd, ad + bc) ∀a, b, c, d ∈ R. Clearly (0, 0) is its
additive zero, (1, 0) is its multiplicative identity and the inverse of (a, b) (≠ (0, 0))
is (a/(a² + b²), −b/(a² + b²)).

Remark The field (R × R, +, •) defined above is isomorphic to the field of complex
numbers (C, +, •).
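
The inverse formula in Example 4.1.4(ii) and the isomorphism with C noted in the Remark can be spot-checked on pairs with rational entries. This is only a sketch under that assumption (exact arithmetic via fractions.Fraction); the names cmul and cinv are hypothetical.

```python
from fractions import Fraction

def cmul(u, v):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)

def cinv(u):
    """Inverse (a/(a^2 + b^2), -b/(a^2 + b^2)) of a non-zero pair with int/Fraction entries."""
    a, b = u
    n = a * a + b * b
    return (Fraction(a, n), Fraction(-b, n))

u = (3, 4)
product_with_inverse = cmul(u, cinv(u))   # expected to be the identity (1, 0)

# Comparison with Python's built-in complex numbers under (a, b) <-> a + bi.
z = complex(3, 4) * complex(1, 2)
```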

The concept of an abstract field is harder to grasp than that of a subfield
of the field of complex numbers, but abstract fields include new classes of fields,
among them finite fields. Finite fields form an important class of abstract fields [see
Sect. 11.2 of Chap. 11]. We now deduce a relation between a finite
integral domain and a field through the following theorem.

Theorem 4.1.6 A finite integral domain is a field. 

Proof Let D be an integral domain consisting of n distinct elements. Suppose
0 ≠ a ∈ D. If b, c ∈ D and b ≠ c, then ab ≠ ac; otherwise ab = ac would imply
b = c by Theorem 4.1.4. Thus if the n distinct elements of D are multiplied by a, we
obtain n distinct elements of D, i.e., all the elements of D. Hence for any given b ∈ D,
there exists a unique element u ∈ D such that au = b. Consequently, the equation
ax = b has a unique solution in D for any pair of elements a, b ∈ D, provided that
a ≠ 0. Again, since multiplication is commutative in D, the equation ya = b has
the very same solution. Consequently, D is a division ring by Theorem 4.1.5.
Finally, as multiplication is commutative in D, D is a field. □
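
The counting argument in this proof can be watched at work in Z_7, and its failure observed in Z_6, which has zero divisors. The sketch below is an illustration only; multiples and solutions are hypothetical helper names, and classes of Z_n are represented by 0, ..., n − 1.

```python
def multiples(a, n):
    """The list of products a*x as x runs over Z_n."""
    return [(a * x) % n for x in range(n)]

def solutions(a, b, n):
    """All x in Z_n with a*x = b."""
    return [x for x in range(n) if (a * x) % n == b % n]

# In the integral domain Z_7, multiplying by any non-zero a permutes the elements,
# so every equation ax = b has exactly one solution.
permutes_z7 = all(sorted(multiples(a, 7)) == list(range(7)) for a in range(1, 7))

# In Z_6 the products by 2 repeat, and 2x = 3 has no solution.
row_z6 = multiples(2, 6)
```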

Example 4.1.5 The commutative ring (Z_n, +, •) (n > 1) is a field iff n is a prime
integer.

Proof It follows from Proposition 4.1.1 and Theorem 4.1.6. □

Remark For a given prime integer p (> 1), there exists a field (Z_p, +, •), denoted by
GF(p), having p elements, called a finite field or Galois field.

Remark Let 0̄ and 1̄ denote the property of an integer being even or odd, respectively.
Define operations on the symbols 0̄ and 1̄ by the tables:

    +  | 0̄  1̄          ×  | 0̄  1̄
    ---+-------         ---+-------
    0̄  | 0̄  1̄          0̄  | 0̄  0̄
    1̄  | 1̄  0̄          1̄  | 0̄  1̄

Note that the operations are defined by analogy with the way in which the corresponding
properties of integers behave under usual addition and multiplication. For
example, since the sum (product) of an even and an odd integer is odd (even), we
write

0̄ + 1̄ = 1̄ (0̄ × 1̄ = 0̄).

Now ({0̄, 1̄}, +, ×) is a field with 0̄ as its zero element and 1̄ as its identity element.
We write 0 for 0̄ and 1 for 1̄ and denote the field by GF(2). Consider an arbitrary
non-empty set S and the commutative ring R consisting of all maps from S to GF(2)
(see Example 4.1.1(v)). Since x² = x ∀x ∈ GF(2), this relation also holds in R.


4.2 Subrings 

It is natural to study subsets of a ring which are closed under addition, subtraction
and multiplication as defined in the ring. We now deal with the
situation where a subset of a ring again forms a ring.

Definition 4.2.1 Let (R, +, •) be a ring and S a non-empty subset of R. If the
ordered triple (S, +, •) constitutes a ring (using the induced operations), i.e., (S, +, •)
is itself a ring under the same compositions of addition and multiplication as in R,
then (S, +, •) is said to be a subring of (R, +, •).

For example, (Z, +, •) is a subring of (R, +, •), and (2Z, +, •) is a subring of
(Z, +, •).

Remark To find a simpler criterion for a subring, we note that (S, +, •) is a subring
of (R, +, •) iff (S, +) is a subgroup of (R, +) and (S, •) is a subsemigroup of (R, •),
as the distributive, commutative and associative laws are hereditary.

Theorem 4.2.1 A subset S (≠ ∅) of a ring R is a subring of R iff a − b, ab ∈ S for
every pair of elements a, b ∈ S.

Proof Suppose the given conditions hold. Since a − b ∈ S for every pair of elements
a, b ∈ S, it follows by Theorem 2.4.1 that (S, +) is a subgroup of (R, +) and, as +
defined on R is commutative, (S, +) is abelian. Again (S, •) is a subsemigroup of
(R, •) as, by hypothesis, the closure property for multiplication holds in S. Associative
and distributive laws hold in S, since they hold for the whole set R. Therefore
(S, +, •) is a subring of (R, +, •). Conversely, if (S, +, •) is a subring, then the given
conditions hold. □

Example 4.2.1 Consider S = {(J J) : i ∈ R}. Then by Theorem 4.2.1, (S, +, •) is
a subring of the ring (M_2(R), +, •) of all 2 × 2 real matrices under usual addition and
multiplication of matrices. Clearly, is the identity of S but this identity is not the

same as the identity (* J) of M_2(R).

Remark For a ring R with identity, the identity (if it exists) of a subring of M_2(R)
may not be the same as the identity of M_2(R).

Theorem 4.2.2 The intersection of any aggregate of subrings of a ring R is a subring
of R.

Proof Let M = ∩_i S_i be the intersection of a given family {S_i} of subrings of R.
M ≠ ∅, for 0 ∈ M. Then a, b ∈ M ⇒ a, b ∈ each S_i ⇒ a − b, ab ∈ each S_i ⇒
a − b, ab ∈ M ⇒ M is a subring of R by Theorem 4.2.1. □

Remark The join (set-theoretic union) of two subrings may not be a subring.

Example 4.2.2 The multiples of 2 and the multiples of 3 are subrings of the ring of
integers. The join of these two subrings contains the elements 2 and 3, but not their
sum 2 + 3 = 5. Consequently, the join is not a subring.
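
Since the subrings in Example 4.2.2 are infinite, the sketch below illustrates the same phenomenon in a finite analogue chosen for this purpose: the multiples of (2) and of (3) inside Z_6. It applies the criterion of Theorem 4.2.1 literally; is_subring is a hypothetical helper name.

```python
def is_subring(S, n):
    """Criterion of Theorem 4.2.1 for a subset S of Z_n: non-empty, closed under a - b and ab."""
    S = set(S)
    return bool(S) and all(
        (a - b) % n in S and (a * b) % n in S for a in S for b in S
    )

A = {0, 2, 4}   # multiples of (2) in Z_6
B = {0, 3}      # multiples of (3) in Z_6

intersection_ok = is_subring(A & B, 6)   # Theorem 4.2.2
union_ok = is_subring(A | B, 6)          # fails: (3) - (2) = (1) is missing
```

Here A and B pass the test, their intersection {0} passes as Theorem 4.2.2 predicts, and their union fails for the same reason as in Example 4.2.2.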

Definition 4.2.2 The center of a ring R, denoted by cent R, is defined by cent R =
{a ∈ R : ar = ra ∀r ∈ R}.

Theorem 4.2.3 For any ring R, cent R is a subring of R.

Proof Clearly, cent R ≠ ∅, since 0 ∈ cent R. Again a, b ∈ cent R ⇒ ar = ra and
br = rb ∀r ∈ R ⇒ (a − b)r = ar − br = ra − rb = r(a − b) and (ab)r = a(br) =
a(rb) = (ar)b = (ra)b = r(ab) ∀r ∈ R ⇒ cent R is a subring of R by
Theorem 4.2.1. □

Remark Every ring R has at least two subrings, viz. {0} and R. These two subrings
are called trivial subrings of R; all other subrings (if they exist) are called non-trivial.
We use the term proper subring to mean a subring which is different from R.

We now study subrings of a field. 

Theorem 4.2.4 A subring with identity of a field is an integral domain.

Proof Let D be a subring of a field F such that 1 ∈ D. Then D is a commutative
ring having no zero divisors, since multiplication is commutative in F and F
contains no zero divisors. Consequently, D is an integral domain. □

Definition 4.2.3 Let (F, +, •) be a field. A subset H (≠ ∅) of F is called a subfield
of (F, +, •) iff the triple (H, +, •) is itself a field (under the induced operations).

For example, the field of rational numbers Q is a subfield of the field of real
numbers R. In any field F, F is a subfield of F.

A formulation or a test for a subfield of a given field is given below: 

Theorem 4.2.5 A non-empty subset H (≠ {0}) of a field F is a subfield of F iff
a − b, ab ∈ H for every pair of elements a, b ∈ H and x⁻¹ ∈ H ∀x ∈ H\{0}.

Proof Let the given conditions hold in H. Then (H, +) forms a group. Moreover,
(H\{0}, •) forms a multiplicative group. Finally, the commutative properties for
addition and multiplication and the distributive properties, being hereditary, hold in H.
Consequently, H is a subfield of F. Conversely, the necessity of the conditions is
obvious. □

Theorem 4.2.6 The intersection of a given aggregate {H_i} of subfields of a field F
is a subfield of F.

Proof Let M = ∩_i H_i be the intersection of the aggregate {H_i}. Then M is a
subfield of F by Theorem 4.2.5. □

Definition 4.2.4 The intersection of all subfields of a field F is a subfield, called 
the prime field of F . 

Thus the prime field of F is the smallest subfield of F . 


4.3 Characteristic of a Ring 

The characteristic of a ring is a non-negative integer (possibly 0), which is very
important in the study of ring theory. The characteristic of an integral domain (hence
of a field) is either 0 or a prime integer.

Let (R, +, •) be a ring. For a ∈ R, na already has a meaning in (R, +) for every
n ∈ Z. The order of the additive cyclic subgroup generated by an element a ∈ R
is called the characteristic of a. Thus a has a non-zero characteristic r iff r is the
smallest positive integer such that ra = 0. Then for any integer n, na = 0 ⇔ n is
divisible by r. If, on the other hand, the additive cyclic subgroup generated by a
contains infinitely many distinct elements, then a is said to have characteristic
infinity or 0; in this case, ra = na ⇒ r = n, and ra ≠ 0 for all non-zero integers r.
Different elements may have different characteristics; 0 always has characteristic 1.

Definition 4.3.1 Let R be a ring. If r is the smallest positive integer for which
ra = 0 ∀a ∈ R, then r is said to be the characteristic of R; if no such positive
integer exists, then R is said to be of characteristic infinity or zero. We write char R
for the characteristic of R.

Example 4.3.1 (i) The ring (Z_18, +, •) has characteristic 18 and the characteristics
of (6), (9), (2), (4) are, respectively, 3, 2, 9, 9.

(ii) Clearly, char Z is 0.
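
The values in Example 4.3.1(i) can be recomputed directly from the definitions. The sketch below is a minimal illustration for Z_n with classes represented by 0, ..., n − 1; the function names are assumptions of the sketch.

```python
def char_of_element(a, n):
    """Smallest positive r with r*a = 0 in Z_n (the characteristic of the class a)."""
    r = 1
    while (r * a) % n != 0:
        r += 1
    return r

def char_of_ring(n):
    """Smallest positive r with r*a = 0 for every a in Z_n (Definition 4.3.1)."""
    r = 1
    while any((r * a) % n != 0 for a in range(n)):
        r += 1
    return r

chars = [char_of_element(a, 18) for a in (6, 9, 2, 4)]
```

Both loops terminate because n*a = 0 in Z_n for every class a. Note also that char_of_ring(18) coincides with the characteristic of the identity (1), the pattern that Theorem 4.3.1 establishes in general.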

Remark Definition 4.3.1 makes an assertion about every element of the ring, but
char R of a ring R with 1 is completely determined by 1. This is shown below.

Theorem 4.3.1 Let R be a ring with identity 1. Then R has characteristic n > 0 iff
n is also the characteristic of 1.

Proof char R = n (> 0) ⇒ na = 0 ∀a ∈ R ⇒ n1 = 0. Now m1 = 0 for some m
satisfying 0 < m < n ⇒ ma = m(1a) = (m1)a = 0a = 0 ∀a ∈ R ⇒ char R < n ⇒
an impossibility ⇒ n is the characteristic of 1. The converse is similar. □

Theorem 4.3.2 Let R be a ring with identity 1. If R has no zero divisor, then char R
is either 0 or a prime integer.

Proof Let char R = n ≠ 0. Assume that n is not prime. Then n has a non-trivial
factorization n = rs, with 1 < r, s < n. Now 0 = n1 = 1 + 1 + ··· (rs terms) =
[1 + 1 + ··· (r terms)][1 + 1 + ··· (s terms)] = (r1) • (s1). But r1 ≠ 0, s1 ≠ 0, since
1 < r, s < n. Then r1 and s1 are zero divisors, which is impossible by hypothesis.
We, therefore, conclude that n must be prime. □

Corollary 1 Let D be an integral domain. Then char D is 0 or a prime number.

Corollary 2 Let F be a field. Then char F is 0 or a prime number.

Example 4.3.2 (i) The field of rational numbers, the field of real numbers and the
field of complex numbers each have characteristic 0, as the integer 1 has
characteristic 0 in each of them.

(ii) char GF(p) is p, as the characteristic of its identity (1) is p.

(iii) Let F be a finite field. Then char F is a prime integer.

[Hint. (iii) Since every field is an integral domain, char F is either 0 or a prime p.
Suppose char F = 0. Then n1 ≠ 0 for every positive integer n and n1 ≠ m1, where
n ≠ m. Hence {n1 : n is a positive integer} is an infinite subset of F. This contradicts
that F is a finite field. Consequently, char F is a prime integer.]

(iv) If an integral domain (D, +, •) has a non-zero characteristic p, then p is the
period (order) of every non-zero element in the group (D, +).

[Hint. (iv) Let a ∈ D\{0}. Then pa = a + a + ··· (p terms) =
[1 + 1 + ··· + 1 (p terms)]a = (p1)a = 0a = 0 ⇒ period of a is a factor of p ⇒ period
of a is 1 or p, since p is prime ⇒ period of a is p as it is not 1, since a ≠ 0.]

(v) If an integral domain (D, +, •) has characteristic 0, then in the group (D, +),
every non-zero element has infinite period (order).

[Hint. Let a ∈ D\{0}. Then for every positive integer n, na = (n1)a. But a ≠ 0;
and, since char D = 0, n1 ≠ 0. Again since D is an integral domain, na ≠ 0 for
every positive integer n, i.e., a has infinite period in the group (D, +).]


4.4 Embedding and Extension for Rings 

It is sometimes convenient to study a ring as a subring of a suitable ring having
additional properties. This motivates the concept of an embedding, which
is a monomorphism. A homomorphism of a ring is similar to a homomorphism of
a group. Consequently, the same terminology, such as homomorphism, monomorphism,
epimorphism, isomorphism, automorphism, etc., is also used for rings. The
real significance of homomorphisms, homomorphic images, kernels of homomorphisms,
and their basic properties are discussed in detail in this section.

The concept of embedding and extension can be introduced for rings in precisely 
the same sense as in the case of groups. 

Definition 4.4.1 Let R and S be rings. A (ring) homomorphism from R to S means
a mapping f : R → S such that f(a + b) = f(a) + f(b) and f(ab) = f(a) • f(b) for
every pair of elements a, b ∈ R.

Ring homomorphisms of special character are named like special group homo- 
morphisms: 

(i) an injective ring homomorphism is called a monomorphism ; 

(ii) a surjective ring homomorphism is called an epimorphism ; 

(iii) a bijective ring homomorphism is called an isomorphism ; 

(iv) a ring isomorphism of a ring onto itself is called an automorphism ; 

(v) a ring homomorphism of a ring into itself is called an endomorphism. 

The existence of a ring isomorphism from R onto S asserts that R and S, in a
significant sense, are the same; i.e., R and S have the same algebraic structure, and
we say that R and S are isomorphic and write R ≅ S.

Let f : R → S be a ring homomorphism. For the time being, if we forget the
multiplications defined on R and S, then f is considered a group homomorphism
from the additive group (R, +) to the additive group (S, +). Then it follows:

Proposition 4.4.1 If f : R → S is a ring homomorphism, then

(i) f(0) = 0′ (where 0 and 0′ are the zero elements of R and S, respectively);

(ii) ∀a ∈ R, f(−a) = −f(a);

(iii) f is injective iff ker f = {0}, where ker f is the kernel of f defined by ker f =
{a ∈ R : f(a) = 0′};

(iv) if f is onto and both R and S contain identity elements 1_R and 1_S, respectively,
then f(1_R) = 1_S.

To obtain other information about ring homomorphisms and their kernels, we 
now bring back into our discussion, the multiplications defined on the rings con- 
cerned. 

Theorem 4.4.1 Let f : R → S be a ring homomorphism (R, S being rings). Then

(i) for each subring A of R, f(A) is a subring of S;

(ii) for each subring B of S, f⁻¹(B) is a subring of R;

(iii) ker f is a subring of R.

Proof (i) Let A be a subring of R. Since A ≠ ∅, f(A) ≠ ∅. Let x, y ∈ f(A). Then
x = f(a) and y = f(b) for some a, b ∈ A. Hence x − y = f(a) − f(b) = f(a) +
f(−b) (by Proposition 4.4.1) = f(a − b) (since f is a homomorphism) ∈ f(A),
since a − b ∈ A. Similarly, xy = f(a)f(b) = f(ab) ∈ f(A), since ab ∈ A.
Consequently, f(A) is a subring of S by Theorem 4.2.1.

(ii) Proof is similar to that of (i).

(iii) Since 0 ∈ ker f, ker f ≠ ∅. Let x, y ∈ ker f. Then f(x) = f(y) = 0′ (zero
element of S) ⇒ f(x − y) = f(x) + f(−y) = f(x) − f(y) = 0′ − 0′ = 0′ and
f(xy) = f(x)f(y) = 0′0′ = 0′ ⇒ both x − y and xy ∈ ker f ⇒ ker f is a subring
of R by Theorem 4.2.1. □

Remark The kernel of a ring homomorphism f : R → S describes how far f is from
being one-to-one. If ker f = {0}, then f is one-to-one; two elements of R are sent by f
to the same element of S ⇔ their difference is in ker f.

Simple Examples of Ring Homomorphisms 

Example 4.4.1 The most trivial examples are the identity homomorphisms. Let R
be a ring, and let I : R → R be the identity function. Then I is a ring homomorphism
which is one-to-one and onto and hence is an isomorphism.

Example 4.4.2 Let S be a ring and let R be a subring of S. Then the inclusion map
i : R → S is a ring homomorphism. In particular, the inclusion map i : Q → R is a
ring homomorphism.

Remark The only difference between the identity function I and i in
Examples 4.4.1 and 4.4.2 is their ranges. If R ⊊ S, the function i : R → S defined by
i(x) = x in Example 4.4.2 is a homomorphism but not an epimorphism.

Example 4.4.3 Let R be a commutative ring with 1 of prime characteristic p (> 1).
Then the map φ : R → R defined by

φ(a) = a^p, ∀a ∈ R

is a homomorphism (Frobenius homomorphism).

Example 4.4.4 If F is a finite field of prime characteristic p, then the homomorphism
φ : F → F, a ↦ a^p is an automorphism.

[Hint. ker φ = {0} ⇒ φ is a monomorphism. Again F is finite ⇒ φ is surjective
by counting.]
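
Examples 4.4.3 and 4.4.4 state that a ↦ a^p is a homomorphism without giving the reason. The standard argument, not spelled out in the text, is that in the binomial expansion of (a + b)^p every inner coefficient C(p, k), 0 < k < p, is divisible by the prime p. The sketch below checks this divisibility and the additivity of the map numerically; the function names are hypothetical, and in Z_p itself the map reduces to the identity, so this is only a sanity check.

```python
from math import comb

def inner_binomials_vanish(n):
    """True iff C(n, k) is divisible by n for all 0 < k < n."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

def power_map_is_additive(n):
    """True iff (a + b)^n = a^n + b^n holds for all classes a, b of Z_n."""
    return all(
        pow(a + b, n, n) == (pow(a, n, n) + pow(b, n, n)) % n
        for a in range(n) for b in range(n)
    )

# Image of the map a -> a^7 on Z_7, sorted to compare with the whole field.
image_z7 = sorted(pow(a, 7, 7) for a in range(7))
```

For the primes tested, both checks succeed and the image of the map on Z_7 is all of Z_7, consistent with Example 4.4.4. For the composite modulus 4, both checks fail, which shows that the primality of p is used.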

Example 4.4.5 Let F be a field of prime characteristic p. Give an example to
show that the homomorphism φ : F → F, a ↦ a^p, may not be an automorphism
if F is not finite.

[Hint. Consider the field F = Z_p(x), the field of rational functions over the
field Z_p, which is the quotient field of Z_p[x], where x is an indeterminate (see
Definition 4.5.5). Then φ(F) = Z_p(x^p) is a proper subfield of Z_p(x) ⇒ φ is not an
automorphism.]

Example 4.4.6 Let F be a field. Determine all the automorphisms of F(x).

[Hint. Every homomorphism f : F(x) → F(x) mapping x onto some y =
(ax + b)/(cx + d) ∈ F(x), a, b, c, d ∈ F and ad − bc ≠ 0, is an automorphism and
conversely, each such y ∈ F(x) gives rise to an automorphism of F(x) mapping x
into y.]

Definition 4.4.2 A ring R is said to be embedded in a ring S (or S is said to be an
extension of R) iff there exists a monomorphism f : R → S from the ring R to the
ring S.

Remark If f is a monomorphism, then f : R → f(R) (⊆ S) is an isomorphism and
f is called an embedding or an identification map. So we can identify R and f(R)
and consider R a subring of S.

For example, R is considered as a subfield of C under the identification map f :
R → C, r ↦ (r, 0).

Example 4.4.7 Consider the mapping f : (Z, +, •) → (M_2(Z), +, •) defined by



∀a ∈ Z.
Then f(a + b) = f(a) + f(b) and f(ab) = f(a)f(b) ∀a, b ∈ Z ⇒ f is a ring
homomorphism. Clearly, f is a monomorphism. Consequently, (Z, +, •) is embedded
in (M_2(Z), +, •).

Theorem 4.4.2 (Dorroh Extension Theorem) Let R be a ring without identity element.
Then R can be embedded in a ring S with identity element.

Proof Let R be a given ring without the identity element and S = Z × R. Define ‘+’
and ‘•’ in S by (n, a) + (m, b) = (n + m, a + b) and (n, a) • (m, b) = (nm, nb +
ma + ab), where nb denotes the nth natural multiple of b and similarly for ma.
Under these operations S is a ring with the identity element (1, 0). Define a mapping
f : R → Z × R by f(a) = (0, a) ∀a ∈ R. Then f is a homomorphism. Now ker f =
{a ∈ R : f(a) = (0, 0_R)} = {0_R} ⇒ f is a monomorphism by Proposition 4.4.1
(where 0_R is the zero element of R) ⇒ R is embedded in Z × R. □
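
To make the construction in this proof concrete, the sketch below carries it out for R = 2Z, the ring of even integers, which has no identity (Example 4.1.1(i)). Elements of S = Z × 2Z are represented as pairs of Python integers; the names d_add, d_mul and embed are illustrative, and the checks run over a small finite sample only.

```python
from itertools import product

def d_add(u, v):
    """(n, a) + (m, b) = (n + m, a + b)."""
    return (u[0] + v[0], u[1] + v[1])

def d_mul(u, v):
    """(n, a) . (m, b) = (nm, nb + ma + ab)."""
    n, a = u
    m, b = v
    return (n * m, n * b + m * a + a * b)

def embed(a):
    """The monomorphism f : R -> Z x R, f(a) = (0, a)."""
    return (0, a)

ONE = (1, 0)
evens = (-4, -2, 0, 2, 4)
sample = [(n, a) for n in range(-2, 3) for a in evens]

identity_ok = all(d_mul(ONE, u) == u == d_mul(u, ONE) for u in sample)
assoc_ok = all(d_mul(d_mul(u, v), w) == d_mul(u, d_mul(v, w))
               for u, v, w in product(sample, repeat=3))
hom_ok = all(embed(a + b) == d_add(embed(a), embed(b)) and
             embed(a * b) == d_mul(embed(a), embed(b))
             for a in evens for b in evens)
```

The formula for the product is what one obtains by formally expanding (n + a)(m + b), which is a convenient way to remember it.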

Theorem 4.4.3 Let R be a ring such that char R = n (> 0). Then R can be embedded
in a ring S with identity element such that char S = n.

Proof Let S = Z_n × R. Define ‘+’ and ‘•’ in S by ((m), a) + ((t), b) = ((m + t),
a + b) and ((m), a)((t), b) = ((mt), mb + ta + ab). The multiples mb and ta are well
defined, since char R = n. Then S forms a ring with
((1), 0) as the identity element. Since n((1), 0) = (n(1), 0) = ((n), 0) = ((0), 0),
and char Z_n = n, it follows that char S = n.

Consider the mapping f : R → Z_n × R defined by f(a) = ((0), a). Clearly, f is a
monomorphism. □

Corollary 1 Let R be a ring without identity element. Then R can be embedded in
a ring S with identity element (with preservation of the characteristic of R).

Proof If char R = 0, then by Theorem 4.4.2, R is embedded in Z × R = S, where
char S = 0, as (1, 0) has infinite order in (S, +). Again if char R = n (n > 0), then
by Theorem 4.4.3, R is embedded in Z_n × R with the requisite properties. □

Corollary 2 Let R be a ring without identity. Then R can be extended to a ring
with identity (with the preservation of the characteristic of R).

Remark If $R$ is a ring with identity element, then an extension $S$ of $R$ may have a different identity element and in such cases, char $S \neq$ char $R$.

Example 4.4.8 $(\mathbb{Z}_{10}, +, \cdot)$ is a ring with $(1)$ as its identity. If $S = \{(0), (2), (4), (6), (8)\} \subset \mathbb{Z}_{10}$, then $(S, +, \cdot)$ is also a ring, having $(6)$ as its identity. Thus $\mathbb{Z}_{10}$ is an extension of $S$, but they have different identities.
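The claims of this example are small enough to verify by brute force. The sketch below represents classes of $\mathbb{Z}_{10}$ by the integers 0 to 9; the helper name `additive_order` is an illustrative choice. It finds the identity of $S$ and compares the characteristics, using the fact that the characteristic of a ring with identity is the additive order of that identity.

```python
S = [0, 2, 4, 6, 8]  # the subring {(0), (2), (4), (6), (8)} of Z_10

# Elements e of S with e*s = s for every s in S (multiplication mod 10).
identity_of_S = [e for e in S if all((e * s) % 10 == s for s in S)]

def additive_order(a, n=10):
    """Smallest k >= 1 with k*a = 0 in Z_n."""
    k, total = 1, a % n
    while total != 0:
        total = (total + a) % n
        k += 1
    return k

char_S = additive_order(identity_of_S[0])  # additive order of (6)
char_Z10 = additive_order(1)               # additive order of (1)
print(identity_of_S, char_S, char_Z10)
```

Here the two characteristics differ, in line with the preceding remark.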

Remark A ring with zero divisors cannot be extended to an integral domain or a 
field, otherwise, we would reach a contradiction (by Theorem 4.2.4). But in case 
of an integral domain an extension to a field is possible. For example, the standard integral domain $\mathbb{Z}$ is extended to the field of rationals $\mathbb{Q}$. The same technique is used to prove the following theorem for an arbitrary integral domain.

Theorem 4.4.4 Let $D$ be an integral domain. Then $D$ can be extended to a field.

Proof Let $T = D \times (D \setminus \{0\}) = \{(x, y) : x, y \in D, y \neq 0\}$. Define a binary relation $\rho$ on $T$ by $(x, y)\,\rho\,(a, b)$ iff $xb = ay$. For any $(x, y) \in T$, $xy = yx \Rightarrow (x, y)\,\rho\,(x, y) \Rightarrow \rho$ is reflexive. Again $(x, y)\,\rho\,(a, b) \Rightarrow xb = ay \Rightarrow ay = xb \Rightarrow (a, b)\,\rho\,(x, y) \Rightarrow \rho$ is symmetric. Finally, $(x, y)\,\rho\,(a, b)$ and $(a, b)\,\rho\,(c, d) \Rightarrow xb = ay$ and $ad = cb \Rightarrow xbd = ayd$ and $ady = cby \Rightarrow xbd = cby$ (as multiplication is commutative in $D$) $\Rightarrow (xd)b = (cy)b \Rightarrow xd = cy$ (by cancellation law, since $b \neq 0$) $\Rightarrow (x, y)\,\rho\,(c, d) \Rightarrow \rho$ is transitive. Consequently, $\rho$ is an equivalence relation on $T$. Let $Q(D) = T/\rho$, the set of all $\rho$-equivalence classes on $T$. Define addition '+' and multiplication '·' in $Q(D)$ by

(i) $((a, b)) + ((c, d)) = ((ad + bc, bd))$ and

(ii) $((a, b))((c, d)) = ((ac, bd))$.

As $b \neq 0$, $d \neq 0$ and $D$ is an integral domain, $bd \neq 0$, so $Q(D)$ is closed with respect to both '+' and '·'. Clearly, the definitions of '+' and '·' in (i) and (ii) are independent of the choice of the representatives of the classes. Then it is easy to verify that in $Q(D)$,

(1) addition '+' and multiplication '·' are commutative and associative;

(2) $((0, a)) \in Q(D)$ is independent of the choice of $a \in D \setminus \{0\}$, as $((0, a)) = ((0, b))$, and $((0, a))$ is the additive identity;

(3) $((-a, b))$ is the additive inverse of $((a, b))$;

(4) distributive laws hold;

(5) for any $a \in D \setminus \{0\}$, $((a, a))$ is independent of $a$ and is the multiplicative identity;

(6) for $((a, b)) \in Q(D)$ with $((a, b))$ not the zero element of $Q(D)$ (this means $a \neq 0$), the multiplicative inverse of $((a, b))$ is $((b, a))$.

Consequently, $(Q(D), +, \cdot)$ is a field. We now consider the map $f : D \to Q(D)$ defined by $f(x) = ((ax, a))$. Note that $((ax, a))$ is independent of the choice of $a \in D \setminus \{0\}$. Then $f$ is a homomorphism. Now $\ker f = \{x \in D : f(x) = 0'\}$ (where $0'$ is the zero element of $Q(D)$) $= \{x \in D : ((ax, a)) = ((0, a))\} = \{0\}$ (as $ax = 0 \Rightarrow x = 0$, since $a \neq 0$) $\Rightarrow f$ is a monomorphism. Consequently, $Q(D)$ is an extension of $D$. □
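The steps of the proof can be sketched for $D = \mathbb{Z}$, with pairs $(x, y)$, $y \neq 0$, standing for representatives of classes. The names `equivalent`, `add`, `mul` and `embed` are illustrative assumptions, and Python's `fractions.Fraction` is used only as an independent cross-check of the arithmetic.

```python
from fractions import Fraction

def equivalent(p, q):
    """(x, y) rho (a, b) iff x*b == a*y."""
    (x, y), (a, b) = p, q
    return x * b == a * y

def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)   # rule (i)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)           # rule (ii)

def embed(x, a=1):
    """f(x) = ((a*x, a)); the class does not depend on a != 0."""
    return (a * x, a)

# Two representatives of the same class, for each of two classes.
p1, p2 = (5, 8), (-10, -16)
q1, q2 = (1, 3), (7, 21)

# Independence of representatives for '+' and '.'.
well_defined = (
    equivalent(add(p1, q1), add(p2, q2))
    and equivalent(mul(p1, q1), mul(p2, q2))
)
# Cross-check against the usual rational arithmetic.
matches_rationals = Fraction(*add(p1, q1)) == Fraction(5, 8) + Fraction(1, 3)
embed_independent = equivalent(embed(3, a=5), embed(3))
print(well_defined, matches_rationals, embed_independent)
```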

Remark 1 The elements of $Q(D)$ are denoted by $((a, b)) = ab^{-1}$ (or $a/b$) and $Q(D)$ is called the quotient field of $D$ (or field of quotients).

Remark 2 For any integral domain D, the quotient field is unique up to isomor- 
phism. (For proof see Corollary of Theorem 4.4.5 or Theorem 4.4.6.) 

Theorem 4.4.5 The quotient fields of two isomorphic integral domains are isomorphic.

Proof Let $D$ and $C$ be isomorphic integral domains and $\beta : D \to C$ an isomorphism.

Consider the map $f : Q(D) \to Q(C)$ defined by

$$f(dd'^{-1}) = \beta(d)(\beta(d'))^{-1}, \quad \text{where } (d, d') \in D \times D \setminus \{0\}.$$

Let $cc'^{-1} \in Q(C)$, where $(c, c') \in C \times C \setminus \{0\}$.

Then $c = \beta(d)$ and $c' = \beta(d')$ for some $(d, d') \in D \times D \setminus \{0\}$. So, $f$ is surjective. Let $(d, d'), (a, a') \in D \times D \setminus \{0\}$.

Then

$$dd'^{-1} = aa'^{-1} \Leftrightarrow da' = ad' \Leftrightarrow \beta(da') = \beta(ad') \Leftrightarrow \beta(d)\beta(a') = \beta(a)\beta(d')$$
$$\Leftrightarrow \beta(d)(\beta(d'))^{-1} = \beta(a)(\beta(a'))^{-1} \Leftrightarrow f(dd'^{-1}) = f(aa'^{-1}) \Rightarrow f \text{ is injective}.$$

Consequently, $f$ is bijective.

Now

$$f(dd'^{-1} + aa'^{-1}) = f((da' + ad')(d'a')^{-1}) = \beta(da' + ad')(\beta(d'a'))^{-1}$$
$$= (\beta(d)\beta(a') + \beta(a)\beta(d'))(\beta(d')\beta(a'))^{-1}$$
$$= \beta(d)(\beta(d'))^{-1} + \beta(a)(\beta(a'))^{-1}$$
$$= f(dd'^{-1}) + f(aa'^{-1}).$$

Similarly, $f(dd'^{-1} \cdot aa'^{-1}) = f(dd'^{-1})f(aa'^{-1})$.

Consequently, $f$ is an isomorphism. □

Corollary For an integral domain $D$, the quotient field $Q(D)$ is unique (up to isomorphism).

Proof In Theorem 4.4.5 take $D = C$ and $\beta$ the identity isomorphism on $D$. □


Let us now study some properties of the quotient field Q(D). 

Theorem 4.4.6 $Q(D)$ is the intersection of all fields which contain $D$ as a subring.

Proof Let $F$ be any field containing $D$ as a subring. Then $F$ contains the solutions of the equations $xb = a$ for all $a, b \in F$, where $b \neq 0$. Hence the quotients $ab^{-1}$ of all pairs $(a, b) \in D \times D \setminus \{0\}$ belong to $F \Rightarrow Q(D) \subseteq F$. Also, $Q(D)$ is itself a field containing $D$ as a subring. Consequently, $Q(D)$ is the intersection of all fields which contain $D$ as a subring. □

Corollary Let $F$ be any field containing an integral domain $D$ as a subring. Then $F$ contains $Q(D)$ as a subfield.

Remark (i) Q(D ) is the smallest field containing D as a subring. 

(ii) Theorem 4.4.6 also proves the uniqueness of the quotient field of a given 
integral domain. 

(iii) If D is a field, then Q(D) = D. 

Problem 1 Find the quotient field of the ring of integers Z. 

Solution Let $T = \mathbb{Z} \times \mathbb{Z} \setminus \{0\}$. Then the equivalence relation $\rho$ on $T$ (defined in Theorem 4.4.4) identifies different pairs whose ratios are the same. For example, $(5, 8)$, $(-10, -16)$, $(40, 64)$ etc. are $\rho$-equivalent and hence these pairs constitute a class. As $\mathbb{Z}$ is an integral domain, the elements of the quotient field $Q(\mathbb{Z})$ of $\mathbb{Z}$ are abstractly the same as these classes, and are called rational numbers; every rational number is of the form $mn^{-1}$, where $m, n$ are integers and $n \neq 0$. Consequently, $Q(\mathbb{Z}) = \mathbb{Q}$, the field of rational numbers.

Remark Let F be a field. Then by Corollary 1 of Theorem 4.3.2, char F = 0 or p 
(a prime number). We show that in each case F has a subfield isomorphic to a well- 
known field. 

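Theorem 4.4.7 Let $F$ be a field.

(a) If char $F = 0$, then $F$ contains a subfield $K$ such that $K$ is isomorphic to $\mathbb{Q}$.

(b) If char $F = p$, a prime integer, then $F$ contains a subfield $K$ such that $K$ is isomorphic to $\mathbb{Z}_p$.

Proof Define a map $f : \mathbb{Z} \to F$, $n \mapsto n1$, $\forall n \in \mathbb{Z}$. Then $f$ is a ring homomorphism.

(a) Suppose char $F = 0$. Then $\ker f = \{n \in \mathbb{Z} : n1 = 0\} = \{0\} \Rightarrow f$ is a monomorphism.

Define a map $\tilde{f} : \mathbb{Q} \to F$ by

$$\tilde{f}(m/n) = f(m)(f(n))^{-1} \quad \forall m/n \in \mathbb{Q}.$$

Clearly, $\tilde{f}$ is well defined and is a monomorphism.

Hence the field $\mathbb{Q}$ is isomorphic to the subfield $\tilde{f}(\mathbb{Q})$ of $F \Rightarrow$ (a) (taking $K = \tilde{f}(\mathbb{Q})$).

(b) Suppose char $F = p > 0$. Since $f$ is a ring homomorphism, it follows that the rings $\mathbb{Z}/\ker f$ and $\operatorname{Im} f = f(\mathbb{Z})$ are isomorphic (see Theorem 5.2.3 of Chap. 5). Now char $F = p > 0 \Rightarrow f(\mathbb{Z}) \neq \{0\} \Rightarrow f(\mathbb{Z})$ is a non-trivial subring with 1 of the field $F \Rightarrow f(\mathbb{Z})$ is an integral domain $\Rightarrow \mathbb{Z}/\ker f$ is an integral domain $\Rightarrow \ker f$ is a prime ideal of $\mathbb{Z}$ (see Theorem 5.3.1 of Chap. 5) $\Rightarrow \ker f = \langle q \rangle$ for some prime integer $q > 0$ (see Problem 1 of Chap. 5). Now $q \in \ker f \Rightarrow f(q) = 0 \Rightarrow q1 = 0 \Rightarrow p \mid q \Rightarrow p = q \Rightarrow \mathbb{Z}/\ker f = \mathbb{Z}/\langle p \rangle = \mathbb{Z}_p \Rightarrow f(\mathbb{Z}) \cong \mathbb{Z}_p \Rightarrow$ (b). □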

Consequently, we have the following theorem.

Theorem 4.4.8 In every field $F$, there is a minimal subfield $P$ (i.e., a subfield $P$ contained in every subfield of $F$), $P$ being isomorphic to $\mathbb{Q}$ or $\mathbb{Z}_p$ according as char $F = 0$ or $p$.

Remark 1 A field $F$ cannot have more than one such minimal subfield: for, if $H$ and $G$ are both minimal subfields of the field $F$, then $G \subseteq H$ and $H \subseteq G$. Consequently $G = H$. The unique minimal subfield in a field $F$ is the prime subfield of $F$.

Remark 2 It follows from Theorem 4.4.8 that the prime subfield of F is the inter- 
section of all subfields of F and it is the smallest subfield of F . 

Remark 3 Theorem 4.4.7 shows that the prime subfield of F is the subfield of F 
generated by the identity element of F . 


4.5 Power Series Rings and Polynomial Rings 

We are familiar with polynomials and power series in one variable with integral coefficients. We extend these notions to coefficients from an arbitrary ring $R$. This motivates us to define the polynomial ring $R[x]$ and the power series ring $R[[x]]$ in one variable. An element of $R[x]$ is a formal symbol $a_0 + a_1x + \cdots + a_nx^n$, where the $a_i$'s are in $R$, and an element of $R[[x]]$ is a formal expression of the form $a_0 + a_1x + \cdots + a_nx^n + \cdots$, where the $a_i$'s are in $R$. In this section we also study the influence of the structure of $R$ on the structure of $R[x]$ and $R[[x]]$. The polynomial ring in $n$ variables is defined recursively (see Ex. 29 of Exercises-I).

Let $(R, +, \cdot)$ be a ring. Consider the collection $P$ of all infinite sequences

$$f = (a_0, a_1, a_2, \ldots, a_n, \ldots) = \{a_n\}$$

of elements $a_n \in R$. The elements of $P$ are called formal power series, or simply power series over $R$.

Two power series $f = (a_0, a_1, a_2, \ldots, a_n, \ldots)$ and $g = (b_0, b_1, b_2, \ldots, b_n, \ldots)$ are said to be equal, denoted by $f = g$, iff $a_n = b_n$ $\forall n \geq 0$. Define '+' and '·' in $P$ as follows:

$$f + g = (a_0 + b_0, a_1 + b_1, \ldots, a_n + b_n, \ldots),$$
$$f \cdot g = (c_0, c_1, c_2, \ldots),$$

where, for each $n \geq 0$, $c_n$ is given by

$$c_n = \sum_{i+j=n} a_ib_j = a_0b_n + a_1b_{n-1} + \cdots + a_{n-1}b_1 + a_nb_0.$$


Then $P$ is closed with respect to '+' and '·'. Moreover, $(P, +)$ is an abelian group. Let $f, g, h \in P$, where $f = \{a_n\}$, $g = \{b_n\}$ and $h = \{c_n\}$. Then $f \cdot g = \{d_n\}$, where $d_n = \sum_{i+j=n} a_ib_j$.

So,

$$(f \cdot g) \cdot h = \{e_n\},$$

where

$$e_n = \sum_{r+s=n} d_rc_s = \sum_{r+s=n} \Big( \sum_{i+j=r} a_ib_j \Big) c_s = \sum_{i+j+s=n} (a_ib_j)c_s$$
$$= \sum_{i+j+s=n} a_i(b_jc_s) = \sum_{i+p=n} a_it_p, \quad \text{where } g \cdot h = \{t_n\},$$

so that $(f \cdot g) \cdot h = f \cdot (g \cdot h)$.

Similarly, we can show that

$$f(g + h) = (a_0, a_1, \ldots, a_n, \ldots)(b_0 + c_0, b_1 + c_1, \ldots) = (e_0, e_1, \ldots),$$

where

$$e_n = \sum_{i+j=n} a_i(b_j + c_j) = \sum_{i+j=n} (a_ib_j + a_ic_j) = \sum_{i+j=n} a_ib_j + \sum_{i+j=n} a_ic_j,$$

so that $f(g + h) = f \cdot g + f \cdot h$.

This shows that the left distributive property is also satisfied in $(P, +, \cdot)$. Similarly, the right distributive property is satisfied in $(P, +, \cdot)$. Thus $(P, +, \cdot)$ is a ring with $0 = (0, 0, 0, \ldots)$ as the zero element of this ring, and the additive inverse of an arbitrary element $(a_0, a_1, a_2, \ldots)$ of $P$ is $(-a_0, -a_1, -a_2, \ldots)$.

Then we have the following.

Theorem 4.5.1 The triple $(P, +, \cdot)$ forms a ring, called the ring of (formal) power series over $R$. Moreover, the ring $(P, +, \cdot)$ is commutative iff $R$ is commutative.
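The two operations can be sketched on truncated sequences. The example below assumes $R = \mathbb{Z}$ and stores only the first $N$ coefficients of each series as a Python list; the names `ps_add` and `ps_mul` are illustrative. Truncation loses nothing for the coefficients kept, because $c_n$ depends only on $a_0, \ldots, a_n$ and $b_0, \ldots, b_n$.

```python
def ps_add(f, g):
    """Coefficientwise sum of two truncated power series."""
    return [a + b for a, b in zip(f, g)]

def ps_mul(f, g):
    """Product with c_n = sum of a_i * b_j over i + j = n."""
    n_terms = min(len(f), len(g))
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(n_terms)]

N = 6
geom = [1] * N                          # 1 + x + x^2 + ...
one_minus_x = [1, -1] + [0] * (N - 2)   # 1 - x
product = ps_mul(one_minus_x, geom)     # first N coefficients of (1 - x)(1 + x + ...)

# Spot checks of associativity and left distributivity on sample data.
f, g, h = [1, 2, 0, 3, 0, 1], [0, 1, 1, 0, 2, 5], [4, 0, 1, 1, 1, 0]
assoc = ps_mul(ps_mul(f, g), h) == ps_mul(f, ps_mul(g, h))
distrib = ps_mul(f, ps_add(g, h)) == ps_add(ps_mul(f, g), ps_mul(f, h))
print(product, assoc, distrib)
```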

Let $S = \{(a, 0, 0, \ldots) : a \in R\}$. Then $(S, +, \cdot)$ is a subring of $(P, +, \cdot)$ such that $S$ is isomorphic to the ring $R$ (under the isomorphism $S \to R$ given by $(a, 0, 0, \ldots) \mapsto a$). In this sense, $P$ contains the original ring $R$ as a subring, and thus an element $a \in R$ is identified with the special sequence $(a, 0, 0, \ldots) \in P$. Consequently, the elements of $R$, regarded as power series, are hereafter called constant series, or just constants.

We use the symbol $ax^n$, $n \geq 1$, to designate the sequence $(0, \ldots, 0, a, 0, \ldots) \in P$, where the element $a$ occupies the $(n + 1)$th place in this sequence. For example,

$$ax = (0, a, 0, 0, \ldots), \quad ax^2 = (0, 0, a, 0, \ldots), \quad ax^3 = (0, 0, 0, a, 0, \ldots) \quad \text{and so on}.$$

Then each power series $f = \{a_n\} \in P$ can be expressed uniquely in the form:

$$f = (a_0, 0, \ldots) + (0, a_1, 0, \ldots) + \cdots + (0, \ldots, 0, a_n, \ldots) + \cdots$$
$$= a_0 + a_1x + a_2x^2 + \cdots + a_nx^n + \cdots$$

(with the identification of $a_0 \in R$ with the sequence $(a_0, 0, 0, \ldots) \in P$). Thus $P$ consists of all formal expressions:

$$f = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n + \cdots,$$

where the elements $a_i$ (called the coefficients of $f$) are in $R$, and we write $f = \sum_{r=0}^{\infty} a_rx^r$ or simply $f = \sum a_rx^r$. Then the definitions of addition '+' and multiplication '·' in $P$ assume the form:

$$\sum a_nx^n + \sum b_nx^n = \sum (a_n + b_n)x^n, \quad \Big( \sum a_nx^n \Big)\Big( \sum b_nx^n \Big) = \sum c_nx^n,$$

where $c_n = \sum_{i+j=n} a_ib_j = \sum_{i=0}^{n} a_ib_{n-i}$.

Here $x$ is simply a symbol (called an indeterminate), not related to the ring $R$ and in no sense representing an element of $R$. The monomials $x^r$ are considered independent in the sense that $\sum a_nx^n = \sum b_nx^n$ iff $a_i = b_i$ for all $i = 0, 1, 2, \ldots$.

Following the usual convention, we write $R[[x]]$ for $P$ and $f(x)$ or simply $f$ for any element of $R[[x]]$.

Remark If $1 \in R$, then we can identify the power series $(0, 1, 0, 0, \ldots)$ with $x$, $(0, 0, 1, 0, \ldots)$ with $x^2$, $(0, 0, 0, 1, 0, \ldots)$ with $x^3$ and so on. Thus $ax$ is an actual product of elements of $R[[x]]$ defined by $ax = (a, 0, 0, \ldots)(0, 1, 0, 0, \ldots)$ and so on.

Following the convention, we omit terms with zero coefficients and replace $(-a_n)x^n$ by $-a_nx^n$. A formal power series $f(x) = \sum a_ix^i \in R[[x]]$ is said to be a zero power series iff $a_i = 0$ $\forall i$; otherwise it is said to be a non-zero power series.

Definition 4.5.1 If $f(x) = \sum a_rx^r$ is a non-zero power series in $R[[x]]$, then the smallest integer $n$ such that $a_n \neq 0$ is called the order of $f(x)$ and is denoted by ord $f$ (or simply $O(f)$).

Theorem 4.5.2 If $R$ is an integral domain, then $R[[x]]$ is also an integral domain.

Proof Clearly, $R[[x]]$ is a commutative ring with identity. We claim that $R[[x]]$ contains no zero divisors. Let $f(x) \neq 0$, $g(x) \neq 0$ in $R[[x]]$, with $O(f) = n$ and $O(g) = m$, so that

$$f(x) = a_nx^n + a_{n+1}x^{n+1} + \cdots \quad (a_n \neq 0),$$
$$g(x) = b_mx^m + b_{m+1}x^{m+1} + \cdots \quad (b_m \neq 0).$$

Then

$$f(x)g(x) = a_nb_mx^{n+m} + (a_{n+1}b_m + a_nb_{m+1})x^{n+m+1} + \cdots.$$

As $R$ does not contain zero divisors, $a_nb_m \neq 0$. Hence the product $f(x)g(x)$ cannot be a zero power series. Hence $R[[x]]$ is an integral domain. □

Lemma 4.5.1 Let $R$ be a ring with 1. The element $f(x) = \sum a_nx^n$ in $R[[x]]$ is invertible in $R[[x]]$ iff the constant term $a_0$ is invertible in $R$.

Proof Suppose $f(x)$ is invertible in $R[[x]]$. Then there exists $g(x) = \sum b_nx^n \in R[[x]]$ such that $f(x)g(x) = g(x)f(x) = 1$. Now $a_0b_0 = 1 = b_0a_0 \Rightarrow a_0$ is invertible in $R$, with $b_0 = a_0^{-1}$.

Conversely, suppose $a_0$ is invertible in $R$. Then there exists $b_0 \in R$ such that $a_0b_0 = b_0a_0 = 1$. We define inductively the coefficients of a power series $\sum b_nx^n$ in $R[[x]]$ which is the inverse of $f(x)$. To start with, we determine the coefficients $b_n$ so that $f(x)g(x) = 1$, i.e., $g(x)$ is a right inverse of $f(x)$. From the definition of multiplication of power series, if the following equations are satisfied by the coefficients of $g(x)$, then $g(x)$ is a right inverse of $f(x)$:

(i) $a_0b_0 = 1$

(ii) $a_0b_1 + a_1b_0 = 0$

(iii) $a_0b_2 + a_1b_1 + a_2b_0 = 0$

$\cdots$

(iv) $a_0b_n + a_1b_{n-1} + \cdots + a_nb_0 = 0$

(v) $a_0b_{n+1} + a_1b_n + \cdots + a_nb_1 + a_{n+1}b_0 = 0$

Since $a_0$ is invertible, $b_0$ is uniquely determined as $a_0^{-1}$. Then $b_1$ is uniquely determined from equation (ii) above. Suppose $b_0, b_1, \ldots, b_n$ have already been defined. Then $b_{n+1}$ is uniquely determined by equation (v) and so on. So every $b_r$ is determined uniquely by induction, such that $f(x)g(x) = 1$. This shows that $f(x)$ has a right inverse $g(x)$.

Similarly, it can be proved that $f(x)$ has a left inverse. Hence $f(x)$ is invertible in $R[[x]]$ (since in any ring an element with a right inverse and a left inverse has an inverse). □
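The inductive determination of the $b_n$ from equations (i) to (v) translates directly into a recurrence: $b_0 = a_0^{-1}$ and $b_n = -a_0^{-1}(a_1b_{n-1} + \cdots + a_nb_0)$ for $n \geq 1$. The following is a minimal sketch assuming $R = \mathbb{Q}$ (so "invertible" means "non-zero"), with the hypothetical helper name `ps_inverse` and exact arithmetic via `fractions.Fraction`.

```python
from fractions import Fraction

def ps_inverse(a, n_terms):
    """First n_terms coefficients of the inverse of sum a[n] x^n over Q."""
    if a[0] == 0:
        raise ValueError("constant term must be invertible")
    # Work with exact rationals and pad the input with zeros.
    a = [Fraction(c) for c in a] + [Fraction(0)] * max(0, n_terms - len(a))
    b = [1 / a[0]]                                   # equation (i)
    for n in range(1, n_terms):
        # a_0 b_n + a_1 b_{n-1} + ... + a_n b_0 = 0, solved for b_n
        s = sum(a[i] * b[n - i] for i in range(1, n + 1))
        b.append(-s / a[0])
    return b

inv_geometric = ps_inverse([1, -1], 5)   # inverse of 1 - x
inv_two_plus_x = ps_inverse([2, 1], 3)   # inverse of 2 + x
print(inv_geometric, inv_two_plus_x)
```

For a non-commutative ring the same recurrence gives a right inverse only, which is why the proof treats the left inverse separately.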

Theorem 4.5.3 (i) A power series $f(x) = \sum a_nx^n$ in $F[[x]]$, where $F$ is a division ring, is invertible in $F[[x]]$ iff its constant term $a_0$ is non-zero.

(ii) If $F$ is a division ring, then the invertible elements of $F[[x]]$ are precisely those power series with non-zero constant term.

Proof (i) and (ii) follow from Lemma 4.5.1. □

Let $R$ be a field. Then $R[[x]]$ is an integral domain by Theorem 4.5.2. Let $R((x))$ be the quotient field of $R[[x]]$.

Theorem 4.5.4 Every non-zero element of $R((x))$ can be uniquely expressed in the form:

$$x^r(a_0 + a_1x + a_2x^2 + \cdots), \quad \text{where } a_i \in R \text{ and } a_0 \neq 0.$$

Proof Any element $f$ of $R((x))$ can be written as

$$f = \frac{b_0 + b_1x + b_2x^2 + \cdots}{x^s(c_s + c_{s+1}x + \cdots)},$$

where $s$ is the least integer such that $c_s \neq 0$. Since $c_s \neq 0$, $c_s + c_{s+1}x + \cdots$ has an inverse $d_0 + d_1x + d_2x^2 + \cdots$ in $R[[x]]$.

Therefore,

$$f = \frac{(b_0 + b_1x + b_2x^2 + \cdots)(d_0 + d_1x + d_2x^2 + \cdots)}{x^s(c_s + c_{s+1}x + \cdots)(d_0 + d_1x + d_2x^2 + \cdots)}$$
$$= \frac{a_0 + a_1x + a_2x^2 + \cdots}{x^s} = x^{-s}(a_0 + a_1x + a_2x^2 + \cdots)$$
$$= a_0x^{-s} + a_1x^{-s+1} + a_2x^{-s+2} + \cdots.$$

Thus $R((x))$ consists of formal power series with a finite number of terms with negative exponents:

$$f = \sum_{n=N}^{\infty} a_nx^n, \quad \text{where } N \text{ is an integer, positive, negative or zero}.$$

If $a_N \neq 0$, $N$ is called the order of $f$ and is as usual denoted by ord$(f)$ (or by $O(f)$). □

Corollary If $R$ is a field, $R((x))$ is also a field.

Proof It follows from Theorem 4.5.4 and Lemma 4.5.1. □

Remark $R((x))$ (sometimes simply denoted by $R\langle x \rangle$) is called the field of formal power series in $x$ over the field $R$.

We now consider a particular class of power series. 

Definition 4.5.2 Let $R$ be a ring and $R[x]$ the set of all power series in $R[[x]]$ whose coefficients are 0 from some index onwards (the index varies from series to series). Then $R[x]$ is called a polynomial ring (in the indeterminate $x$) over the ring $R$. The polynomial 0 such that $a_i = 0$ $\forall i$ is called the zero polynomial; otherwise, a polynomial is called a non-zero polynomial.

Remark Let $R$ be a ring and $R[x]$ the polynomial ring over $R$. If polynomials $f(x) = \sum a_rx^r$, $g(x) = \sum b_rx^r$ are in $R[x]$, with $a_r = 0$ $\forall r > n$ and $b_r = 0$ $\forall r > m$, then

$$a_r + b_r = 0 \quad \forall r > \max(m, n) \quad \text{and} \quad \sum_{i+j=r} a_ib_j = 0 \quad \forall r > m + n.$$

These imply that both the sum $f(x) + g(x)$ and the product $f(x)g(x)$ are in $R[x]$. Clearly, $R[x]$ forms a subring of $R[[x]]$ and is called the ring of polynomials over $R$ (in $x$).

Definition 4.5.3 Given a non-zero polynomial

$$f(x) = a_0 + a_1x + \cdots + a_nx^n \quad (a_n \neq 0) \text{ in } R[x],$$

we call $a_n$ the leading coefficient of $f(x)$ and the integer $n$ the degree of $f(x)$, and write $\deg f = n$.

Remark The degree of any non-zero polynomial is a non-negative integer; no degree 
is given to the zero polynomial 0. The polynomials of degree 0 are precisely the non- 
zero constant polynomials. 

Definition 4.5.4 If $R$ is a ring with identity 1, a polynomial $f(x)$ ($\in R[x]$) whose leading coefficient is 1 is said to be a monic polynomial.

Proposition 4.5.1 If $R$ is an integral domain, then $R[x]$ is also an integral domain.

Proof It follows from Theorem 4.5.2. □ 

Definition 4.5.5 If R is a field, the quotient field R (x) of R [x] is called the field of 
rational functions over the field R . 

Proposition 4.5.2 Let $R$ be a ring and $f(x), g(x) \in R[x]$. If $\deg f = n$, $\deg g = m$ ($m, n \geq 0$), then

$$\deg(f + g) \leq \max(n, m) \quad \text{and} \quad \deg fg \leq m + n.$$

If $R$ is an integral domain, $\deg(fg) = m + n$ and the converse is also true.

Proof Let

$$f(x) = a_0 + a_1x + \cdots + a_nx^n \quad (a_n \neq 0) \quad \text{and} \quad g(x) = b_0 + b_1x + \cdots + b_mx^m \quad (b_m \neq 0) \qquad \text{(A)}$$

be polynomials in $R[x]$.

If $c_r$ is the coefficient of $x^r$ in $f(x)g(x)$, then $c_r = \sum_{i+j=r} a_ib_j$. The first part follows from the Remark of Definition 4.5.2. Next suppose $R$ is an integral domain. In (A), $a_n \neq 0$ and $b_m \neq 0 \Rightarrow c_{m+n} = \sum_{i+j=m+n} a_ib_j = a_nb_m \neq 0$, as $R$ is an integral domain $\Rightarrow \deg(fg) = m + n$. □
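The role of zero divisors in the degree formula can be seen on a small example. The sketch below multiplies $1 + 2x$ and $1 + 3x$ over $\mathbb{Z}_6$ (which has zero divisors) and over $\mathbb{Z}_7$ (a field); the helper names `poly_mul_mod` and `degree` are illustrative, and coefficient lists are stored lowest degree first.

```python
def poly_mul_mod(f, g, n):
    """Product of two polynomials with coefficients reduced mod n."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] = (prod[i + j] + a * b) % n
    return prod

def degree(f):
    """Index of the last non-zero coefficient; None for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return None

f = [1, 2]   # 1 + 2x
g = [1, 3]   # 1 + 3x

# In Z_6[x] the leading term 6x^2 vanishes, so the degree drops below 1 + 1.
deg_mod6 = degree(poly_mul_mod(f, g, 6))
# In Z_7[x] the leading coefficient 6 is non-zero, so deg(fg) = 1 + 1.
deg_mod7 = degree(poly_mul_mod(f, g, 7))
print(deg_mod6, deg_mod7)
```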

Let $R$ be a ring with identity, $S$ an extension of $R$, and $s$ be an arbitrary element of $S$. Then for each polynomial $f(x) = a_0 + a_1x + \cdots + a_nx^n$ in $R[x]$, define $f(s) = a_0 + a_1s + \cdots + a_ns^n \in S$.

Then the element $f(s)$ is called the result of substituting $s$ for $x$ in $f(x)$.

Suppose $f(x), g(x) \in R[x]$ and $r \in S$. If $h(x) = f(x) + g(x)$ and $p(x) = f(x)g(x)$, then clearly $h(r) = f(r) + g(r)$ and $p(r) = f(r)g(r)$. Consequently, the map $\sigma_r : R[x] \to S$, $\sigma_r(f(x)) = f(r)$, is a homomorphism. $\sigma_r$ is called a substitution homomorphism induced by $r$ and its range, denoted by $R[r]$, is thus

$$R[r] = \{f(r) : f(x) \in R[x]\} = \{a_0 + a_1r + \cdots + a_nr^n : a_i \in R;\ n \geq 0\}.$$

Then $R[r]$ forms a subring of $S$ and $R[r]$ is generated by the set $R \cup \{r\}$ (since $1 \in R \Rightarrow 1x = x \in R[x] \Rightarrow r \in R[r]$). Evidently, $R[r] = R \Rightarrow r \in R$.

We claim that $\sigma_r$ is unique. If $\rho$ is a substitution homomorphism induced by $r$, then $\rho(a_i) = a_i$ for each coefficient $a_i$ and $\rho(x^i) = r^i$. Since $\rho$ is a homomorphism, $\rho(f(x)) = \rho(a_0) + \rho(a_1)\rho(x) + \cdots + \rho(a_n)\rho(x^n) = a_0 + a_1r + \cdots + a_nr^n = f(r) = \sigma_r(f(x))$ $\forall f(x) \in R[x] \Rightarrow \rho = \sigma_r$. Thus the following theorem follows:

Theorem 4.5.5 Let $R$ be a ring with identity, $S$ an extension of $R$, and the element $r \in S$. Then there is a unique homomorphism $\sigma_r : R[x] \to S$ such that $\sigma_r(x) = r$, $\sigma_r(a) = a$ $\forall a \in R$.
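The homomorphism property of substitution can be checked numerically. The sketch below assumes $R = \mathbb{Z}$, $S = \mathbb{Q}$ and $r = 2/3$; the names `poly_add`, `poly_mul` and `substitute` are illustrative, and polynomials are coefficient lists with the constant term first.

```python
from fractions import Fraction

def poly_add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b
    return prod

def substitute(f, r):
    """sigma_r(f) = f(r), evaluated by Horner's rule."""
    value = 0
    for coeff in reversed(f):
        value = value * r + coeff
    return value

f = [1, 0, 2]        # 1 + 2x^2
g = [-3, 5]          # -3 + 5x
r = Fraction(2, 3)

additive = substitute(poly_add(f, g), r) == substitute(f, r) + substitute(g, r)
multiplicative = substitute(poly_mul(f, g), r) == substitute(f, r) * substitute(g, r)
print(substitute(f, r), additive, multiplicative)
```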


Let $R$ be a field, $R\langle x \rangle$ the field of formal power series in $x$, and $t \in R\langle x \rangle$. Then by the substitution $x \mapsto t$, we get a homomorphism $\sigma : R\langle x \rangle \to R\langle t \rangle$ defined by

$$\sigma\Big( \sum_{i=N}^{\infty} a_ix^i \Big) = \sum_{i=N}^{\infty} a_it^i. \qquad (4.1)$$

If ord$(t) = r$, then $\sigma$ is called a substitution of order $r$.

Theorem 4.5.6 Every substitution $x \mapsto t$ of order 1 is an automorphism of $R\langle x \rangle$ over $R$.

Proof Let $t = ax^r + \cdots \in R\langle x \rangle$, $a \neq 0$, $r \geq 1$, so that ord$(t) = r \geq 1$.

Let $f(x) = a_Nx^N + a_{N+1}x^{N+1} + \cdots \in R\langle x \rangle$, $a_N \neq 0$.

Then $f(x) \neq 0$, and $f(t) = a_N(ax^r + \cdots)^N + a_{N+1}(ax^r + \cdots)^{N+1} + \cdots = a_Na^Nx^{rN} + \cdots$, and as $a_Na^N \neq 0$, by (4.1) we have $\sigma(f(x)) = f(t) \neq 0$. Thus $\sigma$ is an isomorphism of $R\langle x \rangle$ onto $R\langle t \rangle$. We shall now show that if $r = 1$, then $\sigma$ is an automorphism. For this we have to prove that $R\langle t \rangle = R\langle x \rangle$. This will follow if we can find a sequence of elements $\alpha_1, \alpha_2, \ldots$ in $R$ such that $\sigma$ maps $\alpha_1x + \alpha_2x^2 + \cdots$ onto $x$. In this case, $\alpha_1(ax + \cdots) + \alpha_2(ax + \cdots)^2 + \cdots = x$.

Comparing the coefficients of $x, x^2, x^3, \ldots, x^i$ from both sides, we have

$$\alpha_1a = 1,$$
$$\alpha_2a^2 + \alpha_1(\cdots) = 0,$$
$$\alpha_3a^3 + \alpha_2(\cdots) + \alpha_1(\cdots) = 0,$$
$$\cdots$$
$$\alpha_ia^i + \alpha_{i-1}(\cdots) + \cdots + \alpha_1(\cdots) = 0,$$

and so on, where the omitted expressions are obtained from the coefficients of $t = ax + \cdots$, and hence they are known. From these relations we can solve for $\alpha_1, \alpha_2, \ldots$ successively. This completes the proof. □
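The successive solution of these relations can be sketched in code. The example assumes $R = \mathbb{Q}$ and a series $t$ of order 1 given by its first coefficients; the names `mul_trunc` and `reversion` are hypothetical. At step $i$ the coefficient of $x^i$ in $t^i$ is $a^i$, so $\alpha_i$ is obtained by dividing by it once the contributions of $\alpha_1, \ldots, \alpha_{i-1}$ are known.

```python
from fractions import Fraction

def mul_trunc(f, g, n):
    """First n coefficients of the product of two series of length >= n."""
    return [sum(f[i] * g[k - i] for i in range(k + 1)) for k in range(n)]

def reversion(t, n):
    """Coefficients alpha[0..n-1] with sum alpha_i t^i = x modulo x^n.

    t is a coefficient list [0, a, ...] with a != 0 (a series of order 1).
    """
    t = [Fraction(c) for c in t] + [Fraction(0)] * max(0, n - len(t))
    t = t[:n]
    powers = [None, t]                  # powers[j] holds t^j truncated
    for j in range(2, n):
        powers.append(mul_trunc(powers[-1], t, n))
    alpha = [Fraction(0)] * n
    for i in range(1, n):
        target = 1 if i == 1 else 0     # right-hand side is x
        known = sum(alpha[j] * powers[j][i] for j in range(1, i))
        alpha[i] = (target - known) / powers[i][i]   # powers[i][i] == a**i
    return alpha

alpha = reversion([0, 1, 1], 5)   # t = x + x^2
print(alpha)
```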


4.6 Rings of Continuous Functions 

We now consider another class of important rings, such as rings of functions, in 
particular, rings of continuous functions. 

Definition 4.6.1 If $X$ is a non-empty set and $R$ is a ring, then the set $S$ of all (set) functions $f : X \to R$ is a ring under pointwise addition '+' and pointwise multiplication '·' of functions defined by

$$(f + g)(x) = f(x) + g(x), \quad (f \cdot g)(x) = f(x) \cdot g(x) \quad \forall f, g \in S \text{ and } \forall x \in X.$$

This ring $S$ is called a ring of functions.

Remark This ring $S$ is commutative if and only if $R$ is commutative. $S$ has an identity (which is a constant function) if and only if $R$ has an identity.

Proposition 4.6.1 Let $S(X, R)$ be the ring of all mappings from a non-empty set $X$ to the ring $R$. Then given a ring $T$, every ring epimorphism $k : R \to T$ induces a ring homomorphism $\psi = k_* : S(X, R) \to S(X, T)$ such that if $R$ and $T$ have identity elements, then $\psi$ sends the identity element of $S(X, R)$ into the identity element of $S(X, T)$.

Proof Define $\psi = k_* : S(X, R) \to S(X, T)$ by $\psi(f) = k \circ f$, for all $f \in S(X, R)$. Clearly, $\psi(f + g) = k \circ (f + g) = k \circ f + k \circ g = \psi(f) + \psi(g)$ and $\psi(f \cdot g) = k \circ (f \cdot g) = (k \circ f) \cdot (k \circ g) = \psi(f) \cdot \psi(g)$, for all $f, g \in S(X, R)$.

Suppose $R$ and $T$ have identity elements. Let $c : X \to R$ be the constant map defined by $c(x) = 1_R$ (identity element of $R$), for all $x \in X$, and $d : X \to T$ be the constant map defined by $d(x) = 1_T$ (identity element of $T$), for all $x \in X$. Then $c$ and $d$ are, respectively, the identity elements of $S(X, R)$ and $S(X, T)$. Now $(\psi(c))(x) = (k \circ c)(x) = k(1_R) = 1_T$ (since $k$ is an epimorphism) $= d(x)$, for all $x \in X$. Thus $\psi(c) = d$, which proves the second part. □

We now study the particular ring obtained when $X$ is a topological space and $R$ is the field $\mathbb{R}$ of real numbers. We now define the ring of real-valued continuous functions on $X$, denoted by $C(X)$.

Definition 4.6.2 Let $X$ be a topological space and $C(X)$ be the set of all continuous real-valued functions on $X$. Then $C(X)$ is a commutative ring under the pointwise operations '+' and '·' defined by

$$(f + g)(x) = f(x) + g(x), \quad (f \cdot g)(x) = f(x) \cdot g(x) \quad \forall f, g \in C(X) \text{ and } \forall x \in X.$$

This ring is called the ring of real-valued continuous functions on $X$.

Proposition 4.6.2 The ring $C(X)$ contains the identity element $c : X \to \mathbb{R}$ defined by $c(x) = 1$ for all $x \in X$. Further, if $\psi : C(X) \to \mathbb{R}$ is a non-zero homomorphism, it is an epimorphism such that $\psi(tc) = t$ for all $t \in \mathbb{R}$.

Proof As $\psi$ is a non-zero homomorphism, there exists an element $f \in C(X)$ such that $\psi(f) \neq 0$. Then $\psi(f) = \psi(f \cdot c) = \psi(f)\psi(c)$ implies that $\psi(c) = 1$. Proceeding as in Problem 2, we prove that $\psi(nc) = n$ for each integer $n$, $\psi(rc) = r$ for each rational number $r$ and $\psi(xc) = x$ for each irrational number $x$. Consequently, $\psi$ is an epimorphism such that $\psi(tc) = t$ for all $t \in \mathbb{R}$. □

Remark 1 Let $X$ and $Y$ be two homeomorphic spaces. Then $C(X)$ and $C(Y)$ are isomorphic as rings (see Ex. 11 of the Exercises in Appendix B).

Remark 2 Let X and Y be two compact Hausdorff spaces. If C(X) and C(Y) are 
isomorphic as rings, then X and Y are homeomorphic. It follows from a theorem of 
Gelfand-Kolmogoroff (Dugundji 1980, p. 289). 

We now draw attention to the special ring C([0, 1]). 

Proposition 4.6.3 Let C([0, 1]) be the ring of all real-valued continuous functions 
on [0, 1]. Then C([0, 1]) is a commutative ring with identity having zero divisor. 

Proof Clearly $C([0, 1])$ is a commutative ring with identity. Consider a non-zero function $f \in C([0, 1])$ defined by

$$f(x) = \begin{cases} 0, & \text{if } 0 \leq x \leq \frac{1}{2}, \\ x - \frac{1}{2}, & \text{if } \frac{1}{2} \leq x \leq 1. \end{cases}$$

Let $g \in C([0, 1])$ be defined by

$$g(x) = \begin{cases} \frac{1}{2} - x, & \text{if } 0 \leq x \leq \frac{1}{2}, \\ 0, & \text{if } \frac{1}{2} \leq x \leq 1. \end{cases}$$

Then $g$ is a non-zero function in $C([0, 1])$ such that $fg = 0$; thus $f$ and $g$ are zero divisors. □
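The two functions in the proof can be evaluated on sample points as a sanity check. The sketch below uses exact rationals on a grid of 101 points in $[0, 1]$; the grid size and variable names are arbitrary choices, and a check at finitely many points only illustrates the argument rather than proving it.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def f(x):
    # f vanishes on [0, 1/2] and equals x - 1/2 on [1/2, 1].
    return Fraction(0) if x <= HALF else x - HALF

def g(x):
    # g equals 1/2 - x on [0, 1/2] and vanishes on [1/2, 1].
    return HALF - x if x <= HALF else Fraction(0)

grid = [Fraction(k, 100) for k in range(101)]

product_is_zero = all(f(x) * g(x) == 0 for x in grid)
f_nonzero = any(f(x) != 0 for x in grid)
g_nonzero = any(g(x) != 0 for x in grid)
print(product_is_zero, f_nonzero, g_nonzero)
```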

The ring $C([0, 1])$, sometimes written as $C[0, 1]$, has the following properties.


Proposition 4.6.4 The ring C ([0, 1]) has neither non-trivial nilpotent elements nor 
non-trivial idempotent elements. 


Proof If possible, suppose there exists a non-trivial nilpotent element $f \in C([0, 1])$. Then $(f(x))^n = 0$ for all $x \in [0, 1]$, for some positive integer $n$. Thus $f(x) = 0$ for all $x \in [0, 1]$, which is a contradiction.

Next suppose that $g$ is a non-trivial idempotent element in $C([0, 1])$. Then $(g(x))^2 = g(x)$ for all $x \in [0, 1]$ implies $g(x)(g(x) - 1) = 0$ for all $x \in [0, 1]$. This implies $\operatorname{Im} g = \{0, 1\}$. But the continuous image of $[0, 1]$ under $g$ must be a connected subset of $\mathbb{R}$, which is necessarily an interval of $\mathbb{R}$. Hence we reach a contradiction, as $\{0, 1\} \subset \mathbb{R}$ is not an interval. □


In Chap. 5, we shall study ideals of the ring C([0, 1]). 


4.7 Endomorphism Ring 

We now define another important ring with an eye to the additive group of integers. 
Given an additive abelian group A, there exists a ring, called the endomorphism ring 
of A, having some interesting properties. 

Definition 4.7.1 Let $A$ be an additive abelian group and End$(A)$ the set of all endomorphisms of $A$. Then (End$(A)$, +, ·) is a non-commutative ring (in general) with identity $1_A : A \to A$ under the usual '+' and '·' defined by $(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = (f \circ g)(x) = f(g(x))$, $\forall f, g \in$ End$(A)$ and $\forall x \in A$. End$(A)$ is called the endomorphism ring of $A$.

We now study the special ring End(Z). 

Proposition 4.7.1 The two rings Z and End(Z) are isomorphic. 

Proof Define the mapping $\psi : \mathbb{Z} \to$ End$(\mathbb{Z})$ by $\psi(n) = f_n$, where $f_n : \mathbb{Z} \to \mathbb{Z}$ is defined by $f_n(x) = nx$, $\forall x \in \mathbb{Z}$. We claim that $\psi$ is a ring isomorphism.

$\psi$ is a ring homomorphism: $\psi(m + n) = f_{m+n} : \mathbb{Z} \to \mathbb{Z}$ is defined by $f_{m+n}(x) = (m + n)x = mx + nx = f_m(x) + f_n(x) = (f_m + f_n)(x)$, $\forall x \in \mathbb{Z} \Rightarrow f_{m+n} = f_m + f_n \Rightarrow \psi(m + n) = \psi(m) + \psi(n)$, $\forall m, n \in \mathbb{Z}$. Similarly, $\psi(mn) = \psi(m)\psi(n)$, $\forall m, n \in \mathbb{Z}$. Hence $\psi$ is a ring homomorphism.

$\psi$ is injective: Let $\psi(m) = \psi(n)$. Then $f_m = f_n \Rightarrow f_m(x) = f_n(x)$ $\forall x \in \mathbb{Z} \Rightarrow mx = nx$ $\forall x \in \mathbb{Z} \Rightarrow m = n$ (by taking in particular $x = 1$) $\Rightarrow \psi$ is injective.

$\psi$ is surjective: Let $f \in$ End$(\mathbb{Z})$. Then $f : (\mathbb{Z}, +) \to (\mathbb{Z}, +)$ is a group homomorphism. $1 \in \mathbb{Z} \Rightarrow f(1) \in \mathbb{Z} \Rightarrow f(1) = n$ for some $n \in \mathbb{Z} \Rightarrow f_n(x) = nx = f(1)x = xf(1) = f(x \cdot 1) = f(x)$, $\forall x \in \mathbb{Z} \Rightarrow f_n = f \Rightarrow \psi(n) = f \Rightarrow \psi$ is surjective. Consequently, $\psi$ is a ring isomorphism. □

Remark In general End$(A)$ is non-commutative. For example, consider the free abelian group $(A, +) = (\mathbb{Z} \oplus \mathbb{Z}, +)$. Let $f, g \in$ End$(A)$ be defined on the generators $(1, 0)$ and $(0, 1)$ of the group $A$ as follows:

$$f(1, 0) = (1, 0) \text{ and } f(0, 1) = (1, 0);$$

and

$$g(1, 0) = (0, 0) \text{ and } g(0, 1) = (0, 1).$$

Then

$$(f \circ g)(n, m) = f(g(n, m)) = f(0, m) = (m, 0)$$

and

$$(g \circ f)(n, m) = g(f(n, m)) = g(n + m, 0) = (0, 0)$$

for all $(n, m) \in A \Rightarrow f \circ g \neq g \circ f$.
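The two endomorphisms of this remark are easy to write out explicitly. In the sketch below, elements of $\mathbb{Z} \oplus \mathbb{Z}$ are Python pairs, and `f` and `g` are the additive extensions of the values prescribed on the generators; the sample point $(2, 5)$ is an arbitrary choice.

```python
def f(v):
    # f(1,0) = (1,0), f(0,1) = (1,0), extended additively: f(n,m) = (n+m, 0)
    n, m = v
    return (n + m, 0)

def g(v):
    # g(1,0) = (0,0), g(0,1) = (0,1), extended additively: g(n,m) = (0, m)
    n, m = v
    return (0, m)

v = (2, 5)
fg = f(g(v))   # (f o g)(n, m) = (m, 0)
gf = g(f(v))   # (g o f)(n, m) = (0, 0)
print(fg, gf, fg != gf)
```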

Any ring with identity can be embedded in an endomorphism ring. 

Theorem 4.7.1 Let $R$ be a ring with 1. Then $R$ is isomorphic to a subring of End$(R)$.

Proof Define the mapping $\psi : R \to$ End$(R)$ by $\psi(a) = f_a$ for all $a \in R$, where $f_a : R \to R$ is defined by $f_a(x) = ax$, for all $x \in R$. Clearly, $f_a$ is an endomorphism of the abelian group $R$ by the distributive property in $R$. Moreover, as before, $\psi$ is a ring homomorphism. Now $\ker \psi = \{a \in R : f_a = 0\} = \{a \in R : ax = 0\ \forall x \in R\} = \{0\}$ (taking $x = 1$). Hence $\psi$ is a monomorphism. Consequently, $R$ is isomorphic to a subring of End$(R)$. □


Corollary Any ring R can be embedded in the ring End (R). 

Proof It follows from Theorems 4.4.2 and 4.7.1. □ 


We now find the endomorphism ring of any finite cyclic group of prime order.

Proposition 4.7.2 The two rings End$(\mathbb{Z}_p)$ and $\mathbb{Z}_p$ are isomorphic.

Proof Define $\psi :$ End$(\mathbb{Z}_p) \to \mathbb{Z}_p$ by $\psi(f) = f((1))$, where $f : \mathbb{Z}_p \to \mathbb{Z}_p$ is a group homomorphism. Clearly, $\psi$ is well defined and an isomorphism. □


4.8 Problems and Supplementary Examples 

Problem 1 Let $(R, +, \cdot)$ be a ring and $d$ an element of $R$ which is not a left zero divisor. Show that the characteristic of $R$ is finite iff the order of $d$ in the additive group $(R, +)$ is finite. Moreover, char $R = m$ iff the order of $d$ in $(R, +)$ is $m$.

[Hint. Let $(R, +, \cdot)$ be a ring and $d$ be not a left zero divisor. Suppose $(R, +, \cdot)$ is of finite characteristic. Then $\exists$ a positive integer $m > 1$ such that $mr = 0$ $\forall r \in R$. Then $md = 0$ and hence $d$ has a finite order in $(R, +)$.

Conversely, let $d$ have finite order in $(R, +)$. Then $\exists$ a positive integer $m > 1$ such that $md = 0$. Let $x$ be an arbitrary element of $R$. Then $0 = (md)x = d(mx)$ (by distributive property) $\Rightarrow mx = 0$ (as $d$ is not a left zero divisor) $\forall x \in R \Rightarrow$ char $R$ is finite.

Next let char $R = m$. Then $md = 0 \Rightarrow$ the order of $d$ is a divisor of $m$. If possible, let the order of $d$ be $r$, $1 \leq r < m$. Now $rd = 0 \Rightarrow$ for any $x \in R$, $0 = (rd)x = d(rx) \Rightarrow rx = 0$ (as $d$ is not a left zero divisor) $\Rightarrow$ char $R \leq r < m \Rightarrow$ a contradiction.

Conversely, if the order of $d$ in $(R, +)$ is $m$, then $md = 0$ and $rd \neq 0$ if $1 \leq r < m$. Now, for any $x \in R$, $mx = 0 \Rightarrow$ char $R \leq m$. If $1 \leq r < m$, then $rd \neq 0 \Rightarrow$ char $R \geq m$. Consequently, char $R = m$.]

Problem 2 Let $\mathbb{R}$ be the field of reals and $f : \mathbb{R} \to \mathbb{R}$ a ring homomorphism such that $f(1) = 1$. Show that $f(x) = x$ for all $x \in \mathbb{R}$ and $f$ is an automorphism.

[Hint. $f(1) = 1 \Rightarrow f(n) = f(1 + 1 + \cdots + 1) = f(1) + f(1) + \cdots + f(1)$ ($n$ times) $= nf(1) = n$ $\forall$ integers $n > 0$.

If the integer $n < 0$, $f(n) = -nf(-1) = nf(1) = n$.

Thus $f(n) = n$ $\forall$ integers $n \Rightarrow f$ is the identity map on the set of integers.

Let $p/q$ be a rational number such that $q > 0$ and $\gcd(p, q) = 1$. Now

$$p = f(p) = f\Big( \frac{p}{q} + \frac{p}{q} + \cdots + \frac{p}{q} \Big)\ (q \text{ times}) = qf\Big( \frac{p}{q} \Big) \Rightarrow f\Big( \frac{p}{q} \Big) = \frac{p}{q}$$

(since $f(p) = p$, as $p$ is an integer, positive or negative) $\Rightarrow f$ is the identity map on the set of rationals.

Let $a$ and $b$ be real numbers such that $a > b$. Then $a - b > 0$ and hence $\exists$ a real number $c \neq 0$ such that $c^2 = a - b$. Then $f(a - b) = f(c^2) = (f(c))^2$ (as $f$ is a homomorphism) $> 0$ (since $f(c) \neq 0$, a non-zero homomorphism of fields being injective) $\Rightarrow f(a) - f(b) > 0$. Thus $a > b \Rightarrow f(a) > f(b) \Rightarrow f$ is strictly monotonic.

Let $x$ be any irrational number. Then $\exists$ rational numbers $a_n$ and $b_n$ such that $a_n < x < b_n$ and $a_n - b_n \to 0$ as $n \to \infty$.

So, $f(a_n) < f(x) < f(b_n)$ (as $f$ preserves ordering) $\Rightarrow a_n < f(x) < b_n$ (as $f$ is the identity on the set of rationals) $\Rightarrow |x - f(x)| < |a_n - b_n| \Rightarrow x - f(x) \to 0 \Rightarrow f(x) = x$.]

Problem 3 Show that there is at most one isomorphism under which an arbitrary 
ring R is isomorphic to the ring Z. 

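[Hint. Suppose $f, g : R \to \mathbf{Z}$ are isomorphisms of rings. Then $g^{-1} : \mathbf{Z} \to R$ is
also an isomorphism.

Hence $f \circ g^{-1} : \mathbf{Z} \to \mathbf{Z}$ is necessarily a homomorphism, and it maps 1 to 1.

Then $f \circ g^{-1}$ = identity homomorphism on $\mathbf{Z}$ (by the argument of Problem 2) $\Rightarrow f = g$.]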

Problem 4 Let F be an arbitrary field. Show that F[x] is not a field. 

[Hint. Let $f(x) \in F[x]$ be such that $\deg f(x) > 0$. Suppose $\exists$ a non-zero
polynomial $g(x) \in F[x]$ such that $f(x) g(x) = 1$.

Then $\deg[f(x) g(x)] = \deg f(x) + \deg g(x) > 0 \Rightarrow$ a contradiction, since
$\deg 1 = 0 \Rightarrow F[x]$ is not a field.]

Remark The only invertible elements of $F[x]$ are the non-zero constant
polynomials.

Problem 5 Let D be an integral domain and K be its quotient field. Show that the 
field of rational functions D(x) over D is the same as the field of rational functions 
over K . 

Problem 6 A finite division ring is commutative. This result is known as Wedder- 
burn theorem on finite division rings. 

[Hint. See Sect. 7.2, Herstein (1964).] 

Problems and Supplementary Examples (SE-I) 

1 Let F be a finite field of characteristic p (p is prime). Let $f : F \to F$ be the map
defined by $f(x) = x^p$. Then f is an automorphism of F.

Solution $\forall a, b \in F$, $(a + b)^p = a^p + {}^pC_1 a^{p-1} b + \cdots + {}^pC_r a^{p-r} b^r + \cdots + b^p$ (see
Ex. 12 of Exercises-I) and $(ab)^p = a^p b^p$. Since p divides the binomial
coefficient ${}^pC_r$ for $1 \le r \le p - 1$, $(a + b)^p = a^p + b^p$ (see Ex. 14 of Exercises-I).
Consequently, $f(ab) = f(a) f(b)$ and $f(a + b) = f(a) + f(b)$ $\forall a, b \in F \Rightarrow$ f is a
homomorphism. Now $\ker f = \{0\} \Rightarrow$ f is a monomorphism. As F is finite, f is an
epimorphism. Consequently, f is an automorphism of F.
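The three facts used in this solution can be verified by brute force in the prime field $\mathbf{Z}_p$. This is a minimal sketch under the assumption that p is prime and that F is taken to be $\mathbf{Z}_p$ itself (where, by Fermat's little theorem, the map is in fact the identity); the function name `frobenius_checks` is hypothetical.

```python
from math import comb


def frobenius_checks(p):
    """Check, in Z_p, the facts used in the solution (p is assumed prime)."""
    # p divides every binomial coefficient C(p, r) with 1 <= r <= p - 1
    binomials_vanish = all(comb(p, r) % p == 0 for r in range(1, p))
    # (a + b)^p = a^p + b^p in Z_p
    additive = all(
        pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
        for a in range(p)
        for b in range(p)
    )
    # x -> x^p hits every element of Z_p, so it is a bijection
    image = {pow(x, p, p) for x in range(p)}
    bijective = image == set(range(p))
    return binomials_vanish, additive, bijective
```

For a composite modulus the divisibility of the binomial coefficients can fail, for example $^4C_2 = 6$ is not divisible by 4, which is why primality of p matters.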

2 Let F be a field. Then the groups (F — {0}, •) and (F, +) cannot be isomorphic. 

Solution Let $f : (F - \{0\}, \cdot) \to (F, +)$ be an isomorphism. Then we claim that the
characteristic of F is 2. Otherwise, $0 = f(1) = f((-1) \cdot (-1)) = f(-1) + f(-1) \neq 0$
by Theorem 4.3.1, a contradiction. Let $a \in F - \{0\}$. Then $f(a^2) =
f(a \cdot a) = f(a) + f(a) = 0 = f(1) \Rightarrow a^2 = 1$, as f is a monomorphism $\Rightarrow a = 1$
(or $-1$). In this case $|F| = 2 \Rightarrow |F - \{0\}| = 1$. Thus f is a bijection from a set of
order 1 to a set of order 2 $\Rightarrow$ a contradiction.

3 Let R be a ring such that $|R| > 1$ and for each non-zero $x \in R$ there is a unique
$y \in R$ satisfying $xyx = x$. Then

(i) R has no zero divisors; 

(ii) yxy = y; 

(iii) R has an identity; 

(iv) R is a division ring. 

4 Let $f : R \to D$ be a ring homomorphism, where R is a ring with identity 1 and
D is an integral domain with identity $1'$. If $\ker f \neq R$, then $f(1) = 1'$.

Solution Suppose $f(1) = 0$. Then $\forall x \in R$, $f(x) = f(x \cdot 1) = f(x) f(1) = 0 \Rightarrow$
$\ker f = R \Rightarrow$ a contradiction $\Rightarrow f(1) \neq 0$. Now $1' f(1) = f(1) = f(1 \cdot 1) =
f(1) f(1) \Rightarrow f(1) = 1'$.

5 Let $f : R \to S$ be a ring homomorphism, where R is a field and $1' = f(1) \neq 0$ in S.
Then f is a monomorphism.

Solution Let $a \neq 0$ be in R. We claim that $f(a) \neq 0$. Since R is a field, a has an
inverse $a^{-1}$ in R. Hence $1' = f(1) = f(a a^{-1}) = f(a) \cdot f(a^{-1})$. If $f(a) = 0$, then
$1' = 0 \cdot f(a^{-1}) = 0$, which is a contradiction, since $1' \neq 0$ in S. Thus $\ker f$ contains no
element of R except 0 and hence f is a monomorphism.

6 If R and S are two rings, their direct sum, denoted by $R \oplus S$, is the ring consisting
of the pairs $(x, y)$ with $x \in R$ and $y \in S$, with addition and multiplication given by

$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2),$

$(x_1, y_1) \cdot (x_2, y_2) = (x_1 x_2, y_1 y_2).$

The direct sum of any number of rings is defined in a similar way.

(a) $R \oplus S$ is not an integral domain, since $\forall x \in R$, $\forall y \in S$, $(x, 0) \cdot (0, y) = (0, 0)$,
which is the zero element of $R \oplus S$.

(b) The direct sum $S = R \oplus R \oplus \cdots \oplus R$ of n copies of R can be viewed as the
ring of functions on a set of n elements, viz. $\{1, 2, \ldots, n\}$, with values in R; the
element $(x_1, x_2, \ldots, x_n) \in R \oplus R \oplus \cdots \oplus R$ can be identified with the function f
given by $f(i) = x_i$. Addition and multiplication of functions are given as usual
by operating on their values in R. Clearly, S is a ring.

(c) If $R_i$ is a ring for each $i \in \mathbf{N}^+$, then $P = \bigoplus R_i$ can be viewed as the set of all infinite sequences
$(x_1, x_2, \ldots, x_n, \ldots)$ such that $x_i \in R_i$ for each $i \in \mathbf{N}^+$. Clearly, P is a ring.

7 Let R be a ring isomorphic to a subring A of a ring T. Then there exists an
extension S of the ring R such that S is isomorphic to T.

Proof Construct S by $S = R \cup (T - A)$. As R is isomorphic to A, $\exists$ an isomorphism
$f : R \to A$. Hence we can define a bijective map $g : S \to T$ given by

$g(u) = f(u)$, if $u \in R$

$g(u) = u$, if $u \in T - A = S - R$.

Define addition + and multiplication $\cdot$ on S by

$u + v = g^{-1}(g(u) + g(v))$ and $uv = g^{-1}(g(u) g(v))$, $\forall u, v \in S$ (4.2)

Hence $g(u + v) = g(u) + g(v)$ and $g(uv) = g(u) g(v)$ $\forall u, v \in S \Rightarrow$ g is a
homomorphism. Consequently, g is an isomorphism. Hence S is a ring. To show that R
is a subring of S, take $u, v \in R$. Then $f(u) = g(u)$, $f(v) = g(v)$, $f(u) + f(v) =
g(u) + g(v)$ and $f(u) f(v) = g(u) g(v) \Rightarrow u + v = f^{-1}(f(u) + f(v))$ = the sum
of u and v as originally defined in R and $uv = f^{-1}(f(u) f(v))$ = the product of u
and v as originally defined in R.

Hence the original compositions + and $\cdot$ defined in R are the restrictions of the
corresponding compositions defined in S by (4.2). Consequently, R is a subring of
S and so S is an extension of R. □

8 A field F is said to be perfect iff either char F is 0 or its Frobenius homomor- 
phism (see Example (iii) or Ex. 15 of Exercises-I) is an epimorphism. Then any 
finite field is perfect. 

9 Let R be a ring and let the elements $a, b, c, d \in R$ satisfy the condition

$a + b = c + d$ and $a \cdot b = c \cdot d \Rightarrow$ either $a = c$, $b = d$ or $a = d$, $b = c$.

Then R does not contain any divisor of zero.

Solution If possible, let R contain a divisor of zero. Then there exist non-zero
elements $a, b \in R$ such that $a \cdot b = 0$. Now by hypothesis

$a + b = (a + b) + 0$ and $a \cdot b = (a + b) \cdot 0 \Rightarrow$ either $a = a + b$, $b = 0$ or $a = 0$, $b = a + b$.

In both cases, either $a = 0$ or $b = 0$, a contradiction. Hence, R does not contain any
divisor of zero.

10 A ring R with 1 is a division ring iff for each $x\ (\neq 1) \in R$ $\exists y \in R$ such that
$x + y = xy$.

Solution Let for each $x\ (\neq 1) \in R$, $\exists y \in R$ such that $x + y = xy$. Let $a\ (\neq 0) \in R$.
Let $a = 1$. Then a is the inverse of itself. Next let $a \neq 1$. Now $1 - a \neq 1$ (since
$a \neq 0$). Then by hypothesis, $\exists y \in R$ such that $(1 - a) + y = (1 - a) y \Rightarrow 1 - a + y =
y - ay \Rightarrow 1 - a = -ay \Rightarrow 1 = a(1 - y) \Rightarrow$ a has a right inverse in R. Then it follows
that a has also a left inverse in R. Thus every non-zero element of R has an inverse in
R $\Rightarrow$ R is a division ring. Conversely, let R be a division ring. Let $x\ (\neq 1) \in R$. Then
$x - 1 \neq 0 \Rightarrow$ its inverse $(x - 1)^{-1}$ exists in R $\Rightarrow x(x - 1)^{-1} = y$ (say) $\in R$. Now,
$(x + y)(x - 1) = x(x - 1) + x(x - 1)^{-1}(x - 1) = x^2$ and $(xy)(x - 1) = x \cdot x(x - 1)^{-1}(x - 1)
= x^2 \Rightarrow (x + y)(x - 1) = (xy)(x - 1) \Rightarrow (x + y)(x - 1)(x - 1)^{-1} =
(xy)(x - 1)(x - 1)^{-1} \Rightarrow x + y = xy$.
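The converse direction of this solution is constructive: in a division ring the element $y = x(x - 1)^{-1}$ satisfies $x + y = xy$. The sketch below illustrates this in the field of rationals, which is an assumption made purely for the illustration; the helper name `partner` is hypothetical.

```python
from fractions import Fraction


def partner(x):
    """Return y = x (x - 1)^(-1) in the field Q; defined only for x != 1."""
    x = Fraction(x)
    if x == 1:
        raise ValueError("no y exists for x = 1")
    return x / (x - 1)


samples = [Fraction(0), Fraction(2), Fraction(-3), Fraction(5, 7)]
pairs = [(x, partner(x)) for x in samples]
```

Each pair `(x, y)` produced this way satisfies `x + y == x * y`, and the excluded value x = 1 is exactly the one for which x - 1 has no inverse.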

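11 Find the non-trivial ring homomorphisms from $(\mathbf{Z}, +, \cdot)$ to $(\mathbf{Q}, +, \cdot)$.

Solution Let $f : \mathbf{Z} \to \mathbf{Q}$ be a non-trivial ring homomorphism. Now $f(1) \in \mathbf{Q} \Rightarrow$
either $f(1) = 0$ or $f(1) \neq 0$. If $f(1) = 0$, then $f(x) = f(x \cdot 1) = f(x) \cdot f(1) = 0$,
$\forall x \in \mathbf{Z} \Rightarrow f = 0 \Rightarrow$ a contradiction, since $f \neq 0$ by hypothesis $\Rightarrow f(1) \neq 0$.
Let $f(1) = p/q$, $p \neq 0$, $q > 0$. Then $f(1) = f(1 \cdot 1) = f(1) f(1) \Rightarrow p/q =
p/q \cdot p/q \Rightarrow p/q = 1 \Rightarrow p = q \Rightarrow f(1) = 1 \Rightarrow f(n) = n$, $\forall n \in \mathbf{Z}$, by
Problem 2.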

12 Let F be a field of characteristic $p\ (\neq 0)$. Then $\exists$ only one group
homomorphism $f : (F, +) \to (F - \{0\}, \cdot)$.

Solution Let $f, g : (F, +) \to (F - \{0\}, \cdot)$ be two group homomorphisms. Then

$f(0) = 1 = g(0)$.

Again

$\forall x \in F$, $px = 0$ by hypothesis

$\Rightarrow f(px) = f(0) = g(0) = g(px)$

$\Rightarrow (f(x))^p = (g(x))^p$

$\Rightarrow (f(x))^p - (g(x))^p = 0$.

If p is an odd prime, then $(f(x))^p - (g(x))^p = 0 \Rightarrow (f(x))^p + (-1)^p (g(x))^p =
0 \Rightarrow (f(x))^p + ((-1) g(x))^p = 0 \Rightarrow (f(x) + (-1) g(x))^p = 0$ by Ex. 1 of SE-I $\Rightarrow$
$(f(x) - g(x))^p = 0 \Rightarrow f(x) - g(x) = 0 \Rightarrow f(x) = g(x)$ $\forall x \in F \Rightarrow f = g$.

If $p = 2$, then $(f(x))^2 - (g(x))^2 = 0 \Rightarrow (f(x))^2 + (g(x))^2 = 0$ (since char F = 2,
$a = -a$ $\forall a \in F$) $\Rightarrow (f(x) + g(x))^2 = 0$, since char F = 2 $\Rightarrow f(x) + g(x) = 0$ $\forall x \in
F \Rightarrow f(x) = -g(x) = g(x)$ $\forall x \in F$, since char F = 2.

Thus $f(x) = g(x)$ $\forall x \in F \Rightarrow f = g \Rightarrow \exists$ only one group homomorphism
$f : (F, +) \to (F - \{0\}, \cdot)$.


13 Every finite subgroup of the multiplicative group $G = (F - \{0\}, \cdot)$ of a field F
is cyclic.

Solution Let H be a finite subgroup of G. Use the structure theorem of a finitely
generated abelian group to show that

$H \cong \mathbf{Z}_{n_1} \oplus \mathbf{Z}_{n_2} \oplus \cdots \oplus \mathbf{Z}_{n_t},$

where $n_1 \mid n_2 \mid \cdots \mid n_t$ and each $n_i > 1$.

We claim that $t = 1$. If $t > 1$, let a prime p be a divisor of $n_1$; then p also divides $n_2$.
The cyclic group $\mathbf{Z}_{n_1}$ has p elements of order dividing p and similarly, $\mathbf{Z}_{n_2}$ has p
elements of order dividing p. Hence H has at least $p^2$ elements of order dividing p.
All such elements of H are roots of the polynomial $x^p - 1$ over the field F. But it
cannot have more than p roots.
Hence the contradiction implies that $t = 1$. This shows that H is cyclic.
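As a concrete instance of this result, the whole multiplicative group of the finite field $\mathbf{Z}_p$ must be cyclic, so it has an element of order $p - 1$. The following brute-force sketch, which assumes p is prime and uses the hypothetical helper names `multiplicative_order` and `generators`, lists such elements.

```python
def multiplicative_order(a, p):
    """Order of a in the group of non-zero residues modulo the prime p."""
    k, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        k += 1
    return k


def generators(p):
    """All elements of Z_p - {0} whose order is p - 1 (p is assumed prime)."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]
```

For example, `generators(7)` lists the elements that generate the six non-zero residues modulo 7; the list is never empty for a prime p, in agreement with the result above.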


4.8.1 Exercises 

Exercises-I 

1. Let $\mathbf{Z}[i] = \{a + bi : a, b \text{ are integers and } i^2 = -1\}$. Show that $\mathbf{Z}[i]$ forms a ring
under the compositions $(a + bi) + (c + di) = (a + c) + (b + d)i$, $(a + bi)(c +
di) = (ac - bd) + (ad + bc)i$.

$\mathbf{Z}[i]$ is called the Gaussian ring in honor of C. F. Gauss (1777-1855). For more
results on $\mathbf{Z}[i]$ see Chap. 11.

2. Let G be an (additive) abelian group. Define an operation in G by ab = 0 
Va,b e G. Show that (G, +, •) is a ring. 

3. Let F denote the set of all real-valued functions defined on the closed interval
$I = [0, 1]$. Show that F forms a ring under the compositions $(f + g)(x) =
f(x) + g(x)$, $(f \cdot g)(x) = f(x) g(x)$, $\forall f, g \in F$ and $x \in I$ (F is called the ring of
functions).

4. A ring B with identity in which $a^2 = a$ $\forall a \in B$ (i.e., in which every element is
idempotent) is called a Boolean ring. Show that

(i) in a Boolean ring B, $a + a = 0$ $\forall a \in B$;

(ii) a Boolean ring is commutative;

(iii) the power set $\mathcal{P}(A)$ of a given set A forms a Boolean ring under the
compositions:

$X + Y = (X - Y) \cup (Y - X)$ and $XY = X \cap Y$, $\forall X, Y \in \mathcal{P}(A)$.

5. An element a in ring R satisfying a n = 0 for some positive integer n is called a 
nilpotent element. Show that 

(i) a nilpotent element is a zero divisor (unless R = {0}) but its converse is not 
true in general; 

(ii) the residue class ring Z n contains nilpotent elements iff n has a square 
factor. 

6. Show that any ring (R, +, •) in which a + b = ab Va, b e R is the zero ring. 

7. If the set A contains more than one element, prove that every non-empty proper
subset of A is a zero divisor in the ring $\mathcal{P}(A)$ [see Exercise 4(iii)].

8. Let R be a ring with identity element 1 and without zero divisors. Show that for
$a, b \in R$

(i) $ab = 1 \Leftrightarrow ba = 1$;

(ii) $a^2 = 1 \Leftrightarrow$ either $a = 1$ or $a = -1$.

9. Find the number of ring homomorphisms from the ring Z 12 into Z 28 . 

10. Let $G = \{x_i : i \in I\}$ be any multiplicative group and R any commutative ring
with identity. Consider the set R(G) of all formal sums $\sum_{i \in I} a_i x_i$ for $a_i \in R$
and $x_i \in G$, where all but a finite number of the $a_i$'s are 0. Two such expressions
are said to be equal iff they have the same coefficients. Define + and $\cdot$ in R(G)
by

$\left(\sum_{i \in I} a_i x_i\right) + \left(\sum_{i \in I} b_i x_i\right) = \sum_{i \in I} (a_i + b_i) x_i$

and

$\left(\sum_{i \in I} a_i x_i\right) \cdot \left(\sum_{i \in I} b_i x_i\right) = \sum_{i \in I} c_i x_i,$

where $c_i = \sum a_j b_k$ and the summation is extended over all subscripts j and k for
which $x_j x_k = x_i$. Show that $(R(G), +, \cdot)$ forms a ring (R(G) is called the group
ring of G over R).

11. (a) Show that a ring R is commutative iff $(a + b)^2 = a^2 + 2ab + b^2$ $\forall a, b \in R$.
(b) Let R be a ring with identity and cent R the center of R. Then prove that

(i) if $x^{n+1} - x^n \in$ cent R $\forall x \in R$ and for a fixed positive integer n, then R
is commutative;

(ii) if $x^m - x^n \in$ cent R $\forall x \in R$ and for fixed relatively prime positive
integers m and n, one of which is even, then R is commutative.

12. Let R be a commutative ring with 1. Prove by induction the binomial expansion:
$(a + b)^n = a^n + {}^nC_1 a^{n-1} b + \cdots + {}^nC_r a^{n-r} b^r + \cdots + b^n$ for $a, b \in R$ and for
every positive integer n, where ${}^nC_r$ has the usual meaning.

13. Let R be a commutative ring. If the elements a and b of R are nilpotent, prove
that $(a + b)$ is also nilpotent. Show that the result may not be true if R is not
commutative.

14. For any two elements a, b in a field of positive prime characteristic p, show that
$(a \pm b)^p = a^p \pm b^p$.

15. Let R be a field of positive characteristic p. Show that

(i) $\forall a, b \in R$, $(a + b)^{p^m} = a^{p^m} + b^{p^m}$, where m is a positive integer;

(ii) the map $R \to R$ given by $r \mapsto r^p$ is a homomorphism of rings (called the
Frobenius homomorphism), which is a monomorphism but not an
isomorphism in general.

16. Let R be a commutative ring with 1. If x is a nilpotent element of R, show that
$1 + x$ is a unit in R. Deduce that the sum of a nilpotent element and a unit is a
unit.

[Hint. $(1 + x)^{-1} = 1 - x + x^2 - \cdots + (-1)^{n-1} x^{n-1}$, where $x^n = 0$.]

17. Show that the commutative property of addition is a consequence of the other
properties of a ring if

(i) R contains an identity element 1 or

(ii) R contains at least one element which is not a divisor of zero.

18. Let D be an integral domain and $a, b \in D$. If m and n are relatively prime
positive integers such that $a^n = b^n$ and $a^m = b^m$, show that $a = b$.

19. Let R be a ring. Prove that for any subset $X\ (\neq \emptyset)$ of R, the set $(X) =
\bigcap \{S : X \subseteq S;\ S \text{ is a subring of } R\}$ is the smallest (in the sense of inclusion)
subring of R to contain X. ((X) is called the subring generated by X.)

20. Let R be a ring with identity and S a subring of R. Suppose $a \notin S$ and $(S, a)$ is
the subring generated by the set $S \cup \{a\}$. If $a \in$ cent R, show that

$(S, a) = \left\{ \sum_{i=0}^{n} r_i a^i : n \text{ is a positive integer};\ r_i \in S \right\}.$

21. Show that every ring with 1 can be embedded in a ring of endomorphisms of
some additive abelian group.

22. Let M be the ring of all real matrices of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ and $\mathbf{C}$ the field of
complex numbers. Define $f : \mathbf{C} \to M$ by $f(a + ib) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$. Show that f is
an isomorphism of rings and so M is a field.

23. Let H be the division ring of real quaternions and R the division ring defined 
in Example 4.1.3(ii). Show that R is isomorphic to H . 

[Hint. Define an isomorphism / : H — >► R given by l \-> (^), i \-> (q_°-)> 

24. Let R be a ring. Prove that the following conditions are equivalent. 

(i) R has no non-zero nilpotent elements. 

(ii) If a e R and a 2 = 0, then a = 0. 

25. Prove that in a finite field every element can be expressed as a sum of two 
squares. 

26. In this exercise a ring means a commutative ring with identity. Let A be
a ring $\neq \{0\}$. Show that A is a field $\Leftrightarrow$ every homomorphism of A into a
non-zero ring B is a monomorphism.

27. Let L be a lattice in which sup and inf of two elements a and b are denoted by
$a \vee b$ and $a \wedge b$, respectively. L is a Boolean lattice (or Boolean algebra) iff

(i) L has a least element and a greatest element (denoted by 0 and 1,
respectively);

(ii) each of $\vee$ and $\wedge$ is distributive over the other;

(iii) each $a \in L$ has a unique complement $a' \in L$ such that $a \vee a' = 1$ and
$a \wedge a' = 0$.

(For example, $\mathcal{P}(X)$, the set of all subsets of X, ordered by inclusion is a
Boolean lattice.)

(a) Let L be a Boolean lattice. Define addition and multiplication in L by the
rules

$a + b = (a \wedge b') \vee (a' \wedge b)$ and $ab = a \wedge b$.

Show that L becomes a Boolean ring (Exercise 4).

Conversely, let A be a Boolean ring. Define an ordering on A as follows:
$a \le b \Leftrightarrow a = ab$.

Show that $(A, \le)$ is a Boolean lattice.

Remark The sup and inf are given by $a \vee b = a + b + ab$ and $a \wedge b = ab$ and
the complement $a' = 1 - a$.

(b) Show that there is a one-to-one correspondence between (isomorphism 
classes of) Boolean rings and (isomorphism classes of) Boolean lattices. 

28. We consider only rings with identity. Let B be a ring and A be a subring of B
such that $1 \in A$. An element $x \in B$ is said to be integral over A iff x satisfies
an equation of the form:

$x^n + a_1 x^{n-1} + \cdots + a_n = 0$, where the $a_i$'s are elements of A.

If every element $x \in B$ is integral over A, then B is said to be integral over A.
Prove the following:

Let $A \subseteq B$ be integral domains and B be integral over A. Then B is a
field $\Leftrightarrow$ A is a field.

29. Polynomials in several variables (indeterminates). Let R be a ring and $R_1 =
R[x_1]$ be the polynomial ring over R in the variable $x_1$. Consider the polynomial
ring $R_1[x_2]$ over $R_1$, where the variable is $x_2$. Then $R[x_1, x_2] = R_1[x_2]$ consists
of all polynomials:

$\left(\sum_i a_{i0} x_1^i\right) + \left(\sum_i a_{i1} x_1^i\right) x_2 + \cdots + \left(\sum_i a_{in} x_1^i\right) x_2^n,$

$n = 0, 1, 2, \ldots$, where the summation is finite in each case. $R[x_1, x_2]$ is called the
polynomial ring over R in the two variables $x_1$ and $x_2$. The concept can now
be generalized. The polynomial ring $R[x_1, x_2, \ldots, x_n]$ over R in the n variables
$x_1, x_2, \ldots, x_n$ is defined recursively by $R[x_1, x_2, \ldots, x_n] = R_{n-1}[x_n]$, where
$R_{n-1} = R[x_1, x_2, \ldots, x_{n-1}]$ for $n = 2, 3, \ldots$ and $R_0 = R$. Clearly, the polynomial
ring remains the same under any permutation of the variables.

Thus the ring $R[x_1, x_2, \ldots, x_n]$ is the set of all functions $f : \mathbf{N}^n \to R$ such
that $f(u) \neq 0$ for at most a finite number of elements u of $\mathbf{N}^n$, for each positive
integer n, together with addition and multiplication defined by

$(f + g)(u) = f(u) + g(u)$ and $(fg)(u) = \sum_{v + w = u;\ v, w \in \mathbf{N}^n} f(v) g(w),$

where $f, g \in R[x_1, x_2, \ldots, x_n]$ and $u \in \mathbf{N}^n$.

Prove that

(i) if R is a commutative ring with identity or a ring without zero divisors or an
integral domain, then so is $R[x_1, x_2, \ldots, x_n]$;

(ii) the map $R \to R[x_1, x_2, \ldots, x_n]$ given by $r \mapsto f_r$, where $f_r(0, 0, \ldots, 0) = r$
and $f_r(u) = 0$ for all other $u \in \mathbf{N}^n$, is a monomorphism of rings.

Remark R is identified with its isomorphic image under the map (ii) and hence
R is considered as a subring of $R[x_1, x_2, \ldots, x_n]$.

30. Topological rings. A set R is said to be a topological ring iff 

(i) R is a ring; 

(ii) R is a topological space; 

(iii) the algebraic operations defined on R are continuous. 

In the event that R is a division ring it is said to be a topological division 
ring iff in addition the following condition is satisfied: 

(iv) for any $a\ (\neq 0) \in R$ and for any arbitrary neighborhood W of $a^{-1}$,
$\exists$ a neighborhood U of a satisfying the condition $U^{-1} \subseteq W$.

A commutative topological division ring is called a topological field. 

(a) Show that (i) the field of real numbers, the field of complex numbers, 
the field of power series over the field of residue classes Z p are topo- 
logical fields. 

(ii) The division ring of quaternions is a topological division ring. 

(b) (Pontryagin 1939) Prove that a continuous algebraic division ring (i.e., 
a locally compact non-discrete topological division ring), if connected 
and locally compact, is isomorphic either to the real field, or to the 
complex field or to the quaternion division ring. 

(c) Using (b) prove that every connected locally compact division ring is 
isomorphic either to the field of real numbers or to the field of complex 
numbers or to the division ring of quaternions. 

Exercises A (Objective Type) Identify the correct alternative(s) (may be more 
than one) from the following list: 

1. Let R = {[ a c ~J > ) : a,b G R}. Then under usual matrix addition and usual mul- 
tiplication 

(a) R is a non-commutative ring. 

(b) R is a commutative ring but not a field. 

(c) R is a field. 

(d) R is an integral domain but not a field. 

2. Let $R = (\mathbf{Z}_n, +, \cdot)$ be the ring of integers modulo n. If n is a composite number,
then

(a) R is an integral domain.

(b) R is a commutative ring but not an integral domain.

(c) R is a field.

(d) R is an integral domain but not a field.

3. Let R = (M 2 (R), +, •) be the ring of real matrices of order 2. If S = {(* jj) : 
x £ R}, then 

(a) S is a subring of R having no multiplicative identity. 

(b) S is a subring of R having multiplicative identity $1_S$ such that $1_S = 1_R$,
where $1_R$ is the multiplicative identity element of R.

(c) S is a subring of R such that $1_S \neq 1_R$.

(d) S is not a subring of R. 

4. Let $(\mathbf{Z}, +, \cdot)$ be the ring of integers and $f : \mathbf{Z} \to \mathbf{Z}$ be a ring homomorphism
such that $f(1) = 1$. Then

(a) f is a monomorphism but not an epimorphism.

(b) f is an epimorphism but not a monomorphism.

(c) f is an automorphism.

(d) f is neither a monomorphism nor an epimorphism.

5. Let $(\mathbf{Q}, +, \cdot)$ be the field of rational numbers and $f : \mathbf{Q} \to \mathbf{Q}$ be a ring
homomorphism such that $f(1) \neq 0$. Then

(a) f is a monomorphism but not an epimorphism.

(b) f is an epimorphism but not a monomorphism.

(c) f is an isomorphism.

(d) f is neither a monomorphism nor an epimorphism.

6. The number of automorphisms of the field of real numbers is

(a) one (b) infinitely many (c) two (d) zero.

7. Let $(\mathbf{Z}, +, \cdot)$ be the ring of integers and $f : \mathbf{Z} \to \mathbf{Z}$ be defined by $f(x) = -x$,
$\forall x \in \mathbf{Z}$. Then

(a) f is a group homomorphism but not a ring homomorphism.

(b) f is a ring homomorphism.

(c) f is neither a ring homomorphism nor a group homomorphism.

(d) f is a ring isomorphism.

8. Let $\mathbf{R}[x]$ be the polynomial ring with real coefficients. Then

(a) $\mathbf{R}[x]$ is a field.

(b) $\mathbf{R}[x]$ is an integral domain but not a field.

(c) $\mathbf{R}[x]$ is a commutative ring with zero divisors.

(d) $\mathbf{R}[x]$ is a commutative ring with no multiplicative identity.

9. Let $R = C([0, 1])$ be the set of all real-valued continuous functions on [0, 1].
Define + and $\cdot$ on $C([0, 1])$ by

$(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) \cdot g(x)$,

$\forall f, g \in C([0, 1])$ and $\forall x \in [0, 1]$.
Then

(a) $(R, +, \cdot)$ is not a ring.

(b) $(R, +, \cdot)$ is a ring but not an integral domain.

(c) $(R, +, \cdot)$ is an integral domain.

(d) $(R, +, \cdot)$ is not a non-commutative ring with zero divisors.

10. Let $(M, +)$ be an abelian group and $R = \mathrm{End}(M)$ be the set of all
endomorphisms of M, i.e., $\mathrm{End}(M)$ is the set of all group homomorphisms of M into
itself. Define + and $\cdot$ on R by

$(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(g(x))$, $\forall f, g \in R$ and $\forall x \in M$.

Then

(a) $(R, +, \cdot)$ is not a ring.

(b) $(R, +, \cdot)$ is a commutative ring.

(c) $(R, +, \cdot)$ is a non-commutative ring with identity element.

(d) $(R, +, \cdot)$ is an integral domain.

11. Let $(\mathbf{Z}, +, \cdot)$ be the ring of integers and $R = \mathrm{End}(\mathbf{Z})$ be the set of all
endomorphisms of $(\mathbf{Z}, +)$, i.e., $\mathrm{End}(\mathbf{Z})$ is the set of all group homomorphisms of $\mathbf{Z}$ into
itself. Define + and $\cdot$ on R by

$(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(g(x))$, $\forall f, g \in R$ and $\forall x \in \mathbf{Z}$.

Define $f : \mathbf{Z} \to R$ by $f(n) = f_n$, where $f_n : \mathbf{Z} \to \mathbf{Z}$ is given by $f_n(x) = nx$,
$\forall x \in \mathbf{Z}$. Then

(a) f is not a ring homomorphism.

(b) f is a ring homomorphism but not a ring monomorphism.

(c) f is a ring isomorphism.

(d) f is a ring homomorphism but not a ring epimorphism.

12. Let $R = C([0, 1])$ be the ring of all real-valued continuous functions on [0, 1]
and $\psi : R \to \mathbf{R}$ (where $\mathbf{R}$ is the set of all real numbers) be the map defined by
$\psi(f) = f(1/5)$. Then

(a) $\psi$ is a ring epimorphism.

(b) $\psi$ is a ring homomorphism but not a ring monomorphism.

(c) $\psi$ is a ring monomorphism.

(d) $\psi$ is a ring homomorphism but not a ring epimorphism.

13. The rings $\mathbf{Z}$ and $\mathbf{Z} \times \mathbf{Z}$ are not isomorphic because

(a) one is commutative but the other is not.

(b) one has zero divisors but the other has not.

(c) their cardinalities are different.

(d) one is countable but the other is not.

14. Let $C = \{f : \mathbf{R} \to \mathbf{R} : f \text{ is continuous on } \mathbf{R}\}$ and $D = \{f \in C : f \text{ is
differentiable on } \mathbf{R}\}$. Define + and $\cdot$ on C by

$(f + g)(x) = f(x) + g(x)$ and $(f \cdot g)(x) = f(x) \cdot g(x)$, $\forall f, g \in C$ and $\forall x \in \mathbf{R}$.

Then

(a) C is not a ring.

(b) C is a ring but D is not a subring of C.

(c) C is a ring and D is a subring of C.

(d) both C and D are integral domains.

15. Let Q be the field of all rational numbers. Then the number of non-trivial field 
homomorphisms of Q into itself is 

(a) one (b) two (c) zero (d) infinitely many. 

16. Let (Z, +, •) be the ring of integers and (E, +, •) be ring of even integers. Then 

(a) E is an integral domain. 

(b) Z and E are isomorphic rings. 

(c) Z and E are non-isomorphic rings. 

(d) both Z and E are integral domains. 

17. Let n be the number of non-trivial ring homomorphisms from the ring $\mathbf{Z}_{12}$ to
the ring $\mathbf{Z}_{28}$. Then n is

(a) 7 (b) 1 (c) 3 (d) 4. 

18. Which of the following rings are integral domains? 

(a) R[x], the ring of all polynomials in one variable with real coefficients. 

(b) D[0, 1], the ring of continuously differentiable real-valued functions on the
interval [0, 1] (with respect to pointwise addition and pointwise
multiplication).

(c) $\mathbf{C}[x]$, the ring of all polynomials in one variable with complex coefficients.

(d) $M_n(\mathbf{R})$, the ring of all $n \times n$ matrices with real entries.

Exercises B (True/False Statements) Determine the correct statement with
justification.

1. The ring Z x Z is an integral domain. 

2. The rings $(\mathbf{Z}(\sqrt{2}), +, \cdot)$ and $(\mathbf{Z}(\sqrt{3}), +, \cdot)$ are isomorphic, where $\mathbf{Z}(\sqrt{2}) =
\{a + b\sqrt{2} : a, b \in \mathbf{Z}\}$ and $\mathbf{Z}(\sqrt{3}) = \{c + d\sqrt{3} : c, d \in \mathbf{Z}\}$.

3. There exists a commutative ring R with 1 such that the rings $(R[x], +, \cdot)$ and
$(\mathbf{Z}, +, \cdot)$ are isomorphic.

4. The field of quotients of $(\mathbf{Z}, +, \cdot)$ is isomorphic to $(\mathbf{Q}, +, \cdot)$.

5. Every non-zero element of an integral domain D is a unit in the quotient field
Q(D) of D.

6. The quotient fields of two isomorphic integral domains are isomorphic.

7. Let R be a commutative ring such that $a^2 = a$, $\forall a \in R$. Then $a = -a$, $\forall a \in R$.

8. The rings $(\mathbf{Z}_n, +, \cdot)$ and $\mathrm{End}(\mathbf{Z}_n, +)$ are isomorphic.

9. The ring $\mathrm{End}(\mathbf{Z}_p, +)$ is a field for every prime $p > 1$.

10. The rings $(\mathbf{Z}_2 \times \mathbf{Z}_2, +, \cdot)$ and $\mathrm{End}(\mathbf{Z}_2 \times \mathbf{Z}_2, +)$ are isomorphic.

11. Any homomorphism of a field F onto itself is an automorphism of F.

12. A non-zero element $(a) \in \mathbf{Z}_n$ is invertible in the ring $\mathbf{Z}_n$ $\Leftrightarrow$ $\gcd(a, n) = 1$.

13. A commutative ring D with 1 is an integral domain $\Leftrightarrow$ D is a subring of a field.

14. Every field contains a subfield F such that F is isomorphic either to the field $\mathbf{Q}$
or to one of the fields $\mathbf{Z}_p$ for some prime p.

15. The invertible elements of $\mathbf{R}[[x]]$ are precisely the power series in $\mathbf{R}[[x]]$ with
non-zero constant terms.


4.9 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Artin 1991;
Atiyah and Macdonald 1969; Birkhoff and Mac Lane 1965; Burton 1968; Chatterjee
et al. 2003; Fraleigh 1982; Herstein 1964; Hungerford 1974; Jacobson 1974, 1980;
Lang 1965; McCoy 1964; Simmons 1963; van der Waerden 1970) for further details.
Chapter 5

Ideals of Rings: Introductory Concepts 


In this chapter we continue the study of theory of rings with the help of ideals and 
we show how to construct isomorphic replicas of all homomorphic images of a spec- 
ified abstract ring. In ring theory, an ideal is a special subring of a ring. The concept 
of ideals generalizes many important properties of integers. Ideals and homomor- 
phisms of rings are closely related. Like normal subgroups in the theory of groups, 
ideals play an analogous role in the study of rings. The real significance of ideals in 
a ring is that they enable us to construct other rings which are associated with the 
first in a natural way. Commutative rings and their ideals are closely related. Their 
relations develop ring theory and are applied in many areas of mathematics, such as 
number theory, algebraic geometry, topology and functional analysis. In this chap- 
ter basic concepts of ideals are introduced with many illustrative examples. Ideals of 
rings of continuous functions and the Chinese Remainder Theorem for rings with its 
applications are studied. We study ideals of various important rings and prove differ- 
ent isomorphism theorems. Moreover, applications of ideals to algebraic geometry 
and the Zariski topology are discussed in this chapter. 


5.1 Ideals: Introductory concepts 

The concept of ideals came through the work of Cartan, J.H. Wedderburn, E.
Noether and E. Artin. Many others also found significant applications of it in the theory
of rings and algebras. An ideal of a ring R is a subring of R which is closed with
respect to multiplication on both sides by every element of R . It is now important to 
carry over many results of group theory related to normal subgroups to ring theory. 
Ideals of commutative rings are developed in the study of commutative algebra and 
applied to many areas of mathematics. The theory of ideals of rings of algebraic in- 
tegers prescribes methods for solving many problems of number theory. Analogous 
to a normal subgroup, a subring that is the kernel of a ring homomorphism is what 
we call an ideal. 

Unless otherwise stated, R will denote an arbitrary ring having more than one
element.

Definition 5.1.1 Let A be a non-empty subset of a ring R such that A is an additive
subgroup of R. Then

(i) A is a left ideal of R iff $ra \in A$ for each $a \in A$ and $r \in R$;

(ii) A is a right ideal of R iff $ar \in A$ for each $a \in A$ and $r \in R$;

(iii) A is an ideal of R iff it is both a left ideal of R and a right ideal of R.

Remark Conditions (i) and (ii) assert that A swallows up multiplication from the
left and from the right, respectively, by arbitrary elements of the ring.

R has two obvious ideals, namely, {0} and R itself. 

Clearly, if the ring R is commutative, the concepts of left-, right- and two-sided 
ideals coincide. 

Remark Let A be a left ideal (right ideal) of R. Then $a, b \in A \Rightarrow a - b, ab \in A \Rightarrow A$
is a subring of R (by Theorem 4.2.1).

The converse of the result is not true.

Example 5.1.1 (a) Let R be a ring. Consider the matrix ring $M_2(R)$. Then in $M_2(R)$

(i) the subring $A = \left\{ \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix} : a, b \in R \right\}$ is a right ideal but not a left ideal;

(ii) the subring $B = \left\{ \begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix} : a, b \in R \right\}$ is a left ideal but not a right ideal;

(iii) the subring $C = \left\{ \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} : a \in R \right\}$ is neither a left ideal nor a right ideal.

(b) The ring R is not an ideal of $R[x]$, because R does not contain $a f(x)$ $\forall a \in R$
and $\forall f(x) \in R[x]$.
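Part (a)(i) of the example can be illustrated with integer matrices. This sketch assumes that R is the ring of integers and that A consists of the 2 x 2 matrices whose second row is zero, which is a reading of the example rather than something the text states in this form; the helpers `matmul` and `in_A` are hypothetical.

```python
def matmul(x, y):
    """Product of 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def in_A(m):
    """Membership in A (assumed form): the second row is zero."""
    return m[1] == (0, 0)


a = ((1, 2), (0, 0))           # an element of A
r = ((1, 1), (1, 1))           # an arbitrary element of M_2(Z)
right_product = matmul(a, r)   # a * r, stays inside A
left_product = matmul(r, a)    # r * a, leaves A
```

Multiplying an element of A on the right keeps the second row zero, while multiplying on the left can fill it in, so A absorbs products only from the right.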

Theorem 5.1.1 The intersection of any arbitrary collection of (left, right) ideals of
a ring R is a (left, right) ideal of R.

Proof Let $\{A_i\}$ be an arbitrary collection of ideals of R and $M = \bigcap_i A_i$. Then
$M \neq \emptyset$, since $0 \in$ each $A_i \Rightarrow 0 \in M$. Again $a, b \in M$ and $r \in R \Rightarrow a - b \in A_i$,
$ar, ra \in A_i$ for each i $\Rightarrow a - b, ar, ra \in M \Rightarrow M$ is an ideal of R. □
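Theorem 5.1.1 can be checked directly in a small commutative ring. The sketch below assumes the ring $\mathbf{Z}_{12}$ and represents ideals as Python sets of residues; the helper names `ideal` and `is_ideal` are hypothetical.

```python
def ideal(g, n):
    """The principal ideal generated by g in Z_n, as a set of residues."""
    return {(g * k) % n for k in range(n)}


def is_ideal(s, n):
    """Check the conditions of Definition 5.1.1 in the commutative ring Z_n."""
    closed_under_subtraction = all((a - b) % n in s for a in s for b in s)
    absorbs_products = all((r * a) % n in s for a in s for r in range(n))
    return bool(s) and closed_under_subtraction and absorbs_products


n = 12
I, J = ideal(2, n), ideal(3, n)
M = I & J  # intersection of the two ideals
```

The intersection is again an ideal, whereas the union `I | J` is not, since it contains 3 and 2 but not their difference.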

Ideals generated by a finite number of elements or a single element play an impor- 
tant role in the study of rings. We now consider an arbitrary ring R and a non-empty 
subset A of R. By the symbol (A) we mean the set 

$(A) = \bigcap \{I : A \subseteq I;\ I \text{ is an ideal of } R\}.$

Clearly, $(A) \neq \emptyset$. Thus (A) exists and satisfies $A \subseteq (A)$. Then by Theorem 5.1.1,
(A) forms an ideal of R, known as the ideal generated by the set A. Moreover, for
any ideal I of R with $A \subseteq I$, $(A) \subseteq I \Rightarrow (A)$ is the smallest ideal of R to contain
the set A.

If in particular, $A = \{a_1, a_2, \ldots, a_n\}$, then the ideal which A generates is denoted
by $(a_1, a_2, \ldots, a_n)$. Such an ideal is said to be finitely generated with $a_1, a_2, \ldots, a_n$
as its generators. An ideal (a) generated by just one element a of R is called a
principal ideal generated by a.
Remark The forms of the ideal (a) in cases of commutative and non-commutative
rings are given in Ex. 1 in Exercises-I.

Proposition 5.1.1 Every ideal of the ring of integers (Z, +, •) is a principal ideal. 

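Proof Clearly, $\{0\} = \langle 0 \rangle$ is an ideal of $(\mathbf{Z}, +, \cdot)$. Let A be a non-zero ideal of $(\mathbf{Z}, +, \cdot)$.
Since $a \in A \Rightarrow -a \in A$, A contains positive integers, and the positive integers in A
have a least element u (say) by the well-ordering principle (see Chap. 1). Let $v \in A$.
By the division algorithm, we can express $v = qu + r$, where $0 \le r < u$. Now
$r = v - qu \in A$ and u is the least positive integer in A $\Rightarrow r = 0 \Rightarrow v$ is a multiple
of u $\Rightarrow A \subseteq \langle u \rangle = \{nu : n \in \mathbf{Z}\}$ (an ideal) $\subseteq A \Rightarrow A = \langle u \rangle$.
Thus every ideal of $(\mathbf{Z}, +, \cdot)$ is a principal ideal. □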
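
The proof above locates the generator of an ideal of Z as its least positive element. The following is a minimal sketch of that idea, assuming the ideal is given by finitely many generators and searching only integer combinations with coefficients bounded by a hypothetical parameter `bound`; the function name `least_positive_element` is introduced only for this illustration.

```python
from functools import reduce
from itertools import product
from math import gcd

def least_positive_element(generators, bound=10):
    """Least positive value of sum(c_i * g_i) with every |c_i| <= bound."""
    combos = product(range(-bound, bound + 1), repeat=len(generators))
    values = (
        sum(c * g for c, g in zip(coeffs, generators)) for coeffs in combos
    )
    return min(v for v in values if v > 0)

gens = [12, 18, 30]
u = least_positive_element(gens)

# Every integer combination of the generators is a multiple of their gcd,
# so the least positive element found should agree with the gcd.
print(u, reduce(gcd, gens))
```

For these generators the search returns the same value as the gcd, as the proposition predicts: the ideal generated by 12, 18 and 30 is the principal ideal generated by their gcd.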

Remark Let I be an ideal of Z. Then there exists a unique non-negative integer d
such that $I = \langle d \rangle$, i.e., $I = d\mathbf{Z}$. Uniqueness is seen as follows.

The case of the zero ideal is trivial. For a non-zero ideal, let $d\mathbf{Z} = e\mathbf{Z}$
for some non-negative integers d and e. Then d divides e and e divides d, resulting
in $d = \pm e$. As both d and e are non-negative, $e = d$. In particular, $\langle 2 \rangle = 2\mathbf{Z}$, the ring of
even integers, forms an ideal of Z. Clearly, $\langle 0 \rangle = \{0\}$ and $\langle 1 \rangle = \mathbf{Z}$.

Definition 5.1.2 A ring R is said to be a principal ideal ring iff every ideal A of R
is of the form $A = \langle a \rangle$ for some $a \in R$.

Proposition 5.1.1 shows that $(\mathbf{Z}, +, \cdot)$ is a principal ideal ring.

Every ring R has at least two ideals viz. {0} and R. The ideals {0} and R are 
called trivial ideals. Any other ideals (if they exist) are called non-trivial ideals. By 
a proper ideal of R we mean an ideal different from R . 

Definition 5.1.3 A ring R is said to be simple iff it has no non-trivial ideals. 

Theorem 5.1.2 A division ring has no non-trivial ideals. 

Proof Let R be a division ring and A an ideal of R. If $A \neq \{0\}$, then A contains
an element $a \neq 0$. Now $a^{-1}a = 1 \in A$ as $a^{-1} \in R$. Hence $r \cdot 1 = r \in A\ \forall r \in R \Rightarrow$
$R \subseteq A$. Consequently, R = A. □

Corollary Every division ring is a simple ring. 

Remark The converse is not true (see Ex. 2 of the Worked-Out Exercises). 

Remark The join of two ideals of a ring R is not necessarily an ideal, since the join 
of two subrings is not necessarily a subring. 

Definition 5.1.4 The sum (or hcf) of two ideals P and Q of a ring R, denoted by
$P + Q$, is defined by $P + Q = \{p + q : p \in P \text{ and } q \in Q\}$.

Clearly, $P + Q$ is an ideal of R such that it is the smallest ideal of R which
contains both P and Q.

Note that the set $\{pq : p \in P \text{ and } q \in Q\}$ need not be an ideal of R, since it may
fail to be closed under addition.

To avoid this difficulty, we define the product of the ideals P and Q as follows: 

Definition 5.1.5 The product of two ideals P and Q of a ring R, denoted by $PQ$, is
defined by

$$PQ = \Big\{ \sum_{\text{finite}} p_i q_i : p_i \in P \text{ and } q_i \in Q \Big\}.$$

Clearly, $PQ$ is an ideal of R such that $PQ \subseteq P \cap Q$.
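
In the ring Z the sum, product and intersection of principal ideals can be read off from their generators. The sketch below relies on the standard facts that, for positive integers m and n, the sum of the ideals generated by m and n is generated by gcd(m, n), their product by mn, and their intersection by lcm(m, n); these facts are not proved in the text above and are used here as assumptions. The function names are hypothetical.

```python
from math import gcd

def ideal_sum(m, n):
    """Generator of <m> + <n> in Z (assumed fact: the gcd)."""
    return gcd(m, n)

def ideal_product(m, n):
    """Generator of <m><n> in Z (assumed fact: the product mn)."""
    return m * n

def ideal_intersection(m, n):
    """Generator of the intersection of <m> and <n> (assumed fact: the lcm)."""
    return m * n // gcd(m, n)

def contained_in(a, b):
    """<a> is contained in <b> iff b divides a (b non-zero)."""
    return a % b == 0

m, n = 4, 6
print(ideal_sum(m, n), ideal_product(m, n), ideal_intersection(m, n))

# PQ is contained in the intersection of P and Q, and here the inclusion is strict.
print(contained_in(ideal_product(m, n), ideal_intersection(m, n)))
print(contained_in(ideal_intersection(m, n), ideal_product(m, n)))
```

The last two lines illustrate that the inclusion of PQ in the intersection of P and Q can be proper.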


5.2 Quotient Rings 

In this section we continue the discussion of ideals. An ideal of a ring produces 
a new ring, called a quotient ring, which is closely related to the mother ring in a 
natural way. The concept of a quotient ring is similar to the concept of a quotient 
group and hence the method of construction of a quotient ring is borrowed from the 
corresponding construction of a quotient group. In this section we bring over to rings 
various results, such as homomorphism and isomorphism theorems, correspondence 
theorem etc., which are counterparts of the corresponding results of groups and 
establish the relation of ideals to homomorphisms. 

Let $(R, +, \cdot)$ be a ring and I an ideal of R. Then $(I, +)$ is a subgroup of the
abelian group $(R, +)$ and hence the quotient group $(R/I, +)$ is defined under usual
addition of cosets, i.e., $(a + I) + (b + I) = a + b + I$, $\forall a, b \in R$. In this group, the
identity element is the trivial coset I and the inverse of $a + I$ is $-a + I$, $\forall a \in R$.

Theorem 5.2.1 Given an ideal I of a ring $(R, +, \cdot)$, the natural (usual) addition
and multiplication of cosets, namely, $(a + I) + (b + I) = a + b + I$ and
$(a + I)(b + I) = ab + I$ $\forall a, b \in R$, make $(R/I, +, \cdot)$ a ring.

Proof Let $x + I = a + I$ and $y + I = b + I$. Then $x - a \in I$ and $y - b \in I$. Now
$xy - ab = xy - xb + xb - ab = x(y - b) + (x - a)b \in I \Rightarrow$ multiplication of cosets
is well defined. As $(I, +)$ is a normal subgroup of the abelian group $(R, +)$, $(R/I, +)$
is an abelian group. It can be verified that multiplication in R/I is associative and
that the distributive property holds in R/I. So, $(R/I, +, \cdot)$ is a ring. □
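
The key step of the proof is that the product coset does not depend on the chosen representatives. A minimal sketch of this check, assuming R = Z and I = nZ and sampling only a few representatives of each coset (the function name and the `shifts` parameter are hypothetical):

```python
def coset_product_is_well_defined(n, shifts=range(-3, 4)):
    """Check that xy - ab lies in nZ whenever x is in a + nZ and y is in b + nZ."""
    for a in range(n):
        for b in range(n):
            for s in shifts:
                for t in shifts:
                    x, y = a + n * s, b + n * t  # other representatives
                    if (x * y - a * b) % n != 0:
                        return False
    return True

print(coset_product_is_well_defined(6))
```

The check mirrors the identity $xy - ab = x(y - b) + (x - a)b$ used in the proof: both summands are multiples of n.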

Definition 5.2.1 Given an ideal I of a ring R , the ring R/I is called the quotient 
ring or factor ring of R by I (or modulo I). 

Remark If R is a commutative ring with identity, then R/I is also so. 

This successful construction of the quotient ring by an ideal enables us to bring 
over to rings the homomorphism theorems of groups. 

Theorem 5.2.2 shows that ideals and homomorphisms of rings are closely related.

Theorem 5.2.2 If $f : R \to S$ is a homomorphism of rings, then $K = \ker f$ is an
ideal of R. Conversely, if I is an ideal of R, then the mapping $\pi : R \to R/I$ given
by $\pi(r) = r + I$ is an epimorphism with kernel I.

Proof $0 \in \ker f \Rightarrow \ker f \neq \emptyset$. Let $x, y \in K$ and $r \in R$. Then $f(x - y) = f(x) - f(y) = 0'$
(zero of S) $\Rightarrow x - y \in K$. Also, $f(rx) = f(r) f(x) = f(r) \cdot 0' = 0' \Rightarrow rx \in K$.
Similarly, $xr \in K$. Consequently, K is an ideal of R.

Conversely, let I be an ideal of R. Consider the function $\pi : R \to R/I$ given by
$\pi(r) = r + I$. Now $\pi(r + t) = (r + t) + I = (r + I) + (t + I) = \pi(r) + \pi(t)$ and
$\pi(rt) = rt + I = (r + I)(t + I) = \pi(r)\pi(t)$, $\forall r, t \in R \Rightarrow \pi$ is a ring
homomorphism. Clearly, $\pi$ is an epimorphism.

Now, $\ker \pi = \{r \in R : \pi(r) = I\} = \{r \in R : r + I = I\} = \{r \in R : r \in I\} = I$
$\Rightarrow I$ is the kernel of the homomorphism $\pi : R \to R/I$. □

Corollary I is an ideal of a ring R iff it is the kernel of some homomorphism from
R onto R/I.

Note The homomorphism $\pi : R \to R/I$ is called the canonical or natural epimorphism.

Theorem 5.2.3 (Fundamental Homomorphism Theorem or First Isomorphism Theorem)
Let $(R, +, \cdot)$ and $(S, +, \cdot)$ be rings and $f : R \to S$ be a ring homomorphism.
Then

(i) $K = \ker f$ is an ideal of R;

(ii) the quotient ring $(R/K, +, \cdot)$ is isomorphic to the image $f(R)$ under the
mapping $\bar{f} : R/K \to S$ given by $\bar{f}(r + K) = f(r)$.

Proof (i) follows from Theorem 5.2.2.

(ii) First we show that $\bar{f}$ is well defined and one-one. Now $x + K = y + K \Leftrightarrow$
$x - y \in K \Leftrightarrow f(x - y) = 0' \Leftrightarrow f(x) - f(y) = 0' \Leftrightarrow f(x) = f(y) \Leftrightarrow \bar{f}(x + K) = \bar{f}(y + K)$,
$\forall x, y \in R$. Here $\Rightarrow$ (forward implication) shows that $\bar{f}$ is well defined and $\Leftarrow$
(backward implication) shows that $\bar{f}$ is one-one.

Again,

$$\bar{f}((x + K) + (y + K)) = \bar{f}(x + y + K) = f(x + y) = f(x) + f(y) = \bar{f}(x + K) + \bar{f}(y + K)$$

and

$$\bar{f}((x + K)(y + K)) = \bar{f}(xy + K) = f(xy) = f(x) f(y) = \bar{f}(x + K)\,\bar{f}(y + K) \quad \forall x, y \in R$$

$\Rightarrow \bar{f}$ is a ring homomorphism.

Let $x \in f(R)$. Then $\exists\, r \in R$ such that $f(r) = x$. So

$$\bar{f}(r + K) = f(r) = x \Rightarrow \bar{f} : R/K \to f(R) \text{ is surjective.}$$

Consequently, $\bar{f} : R/K \to f(R)$ is an isomorphism. □
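
The theorem can be illustrated on a small finite ring. The sketch below is an assumption-laden example, not taken from the text: it uses the map f(x) = 4x on the ring of integers modulo 12, which respects addition and multiplication because 4 * 4 = 16 is congruent to 4 modulo 12. It then checks that the induced map on cosets of the kernel is well defined and is a bijection onto the image.

```python
n = 12
R = range(n)

def f(x):
    """Hypothetical example homomorphism Z_12 -> Z_12, x -> 4x (mod 12)."""
    return (4 * x) % n

# f respects both ring operations.
is_hom = all(
    f((x + y) % n) == (f(x) + f(y)) % n and f((x * y) % n) == (f(x) * f(y)) % n
    for x in R
    for y in R
)

kernel = frozenset(x for x in R if f(x) == 0)
image = {f(x) for x in R}

# Cosets x + K, and the values f takes on each coset.
cosets = {frozenset((x + k) % n for k in kernel) for x in R}
induced = {c: {f(x) for x in c} for c in cosets}

well_defined = all(len(values) == 1 for values in induced.values())
induced_values = [next(iter(values)) for values in induced.values()]
bijective = sorted(induced_values) == sorted(image)

print(is_hom, sorted(kernel), sorted(image), well_defined, bijective)
```

Here the kernel has four elements, so R/K has three cosets, matching the three elements of the image.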

Note If, in particular, f is an epimorphism, then R/K is isomorphic to S.

Corollary Let $f : R \to S$ be a homomorphism of rings with $I = \ker f$. Then there
exists a unique monomorphism $\bar{f} : R/I \to S$ such that $f = \bar{f} \circ \pi$, where $\pi$ is the
natural epimorphism $\pi : R \to R/I$; also, $\bar{f}$ is an isomorphism iff f is an epimorphism.

Proof Define a monomorphism $\bar{f} : R/I \to S$ by $\bar{f}(r + I) = f(r)$ $\forall r \in R$. Then
$(\bar{f} \circ \pi)(r) = \bar{f}(r + I) = f(r)$ $\forall r \in R \Rightarrow f = \bar{f} \circ \pi$. Moreover, if f is an
epimorphism, then $\bar{f}$ is an isomorphism by the Note after Theorem 5.2.3. Again $\bar{f}$ is
an isomorphism $\Rightarrow \bar{f}$ is an epimorphism $\Rightarrow f = \bar{f} \circ \pi$ is an epimorphism. To prove the
uniqueness of $\bar{f}$, let $g : R/I \to S$ be a monomorphism such that $\bar{f} \circ \pi = g \circ \pi$.
Now $g(r + I) = g(\pi(r)) = (g \circ \pi)(r) = (\bar{f} \circ \pi)(r) = \bar{f}(\pi(r)) = \bar{f}(r + I)$, for all
$r + I \in R/I$. This implies $\bar{f} = g$. □

Theorem 5.2.4 (Correspondence Theorem) Let $(R, +, \cdot)$ and $(S, +, \cdot)$ be rings and
$f : R \to S$ be an epimorphism. Then there exists a one-to-one correspondence
between the collection of ideals of R containing K ($= \ker f$) and the collection of
ideals of S.

Proof Let $I(R)$ and $I(S)$ denote the collection of all ideals of R containing K and
the collection of all ideals of S, respectively. Define $f_* : I(R) \to I(S)$ by $f_*(T) = f(T)$;
here $f(T) \in I(S)$, since f is an epimorphism and so $f(T)$ is an ideal of S. Let
$A, B \in I(R)$ and $f_*(A) = f_*(B)$. Then $f(A) = f(B)$. We claim that A = B. Let $a \in A$.
Then $\exists\, b \in B$ such that $f(a) = f(b) \Rightarrow f(a - b) = 0_S \Rightarrow a - b \in \ker f = K \subseteq B$
$\Rightarrow a = (a - b) + b \in B \Rightarrow a \in B\ \forall a \in A \Rightarrow A \subseteq B$. Similarly $B \subseteq A$. Hence
A = B, so $f_*$ is injective. Let M be an element of $I(S)$. Then $f^{-1}(M)$ is an ideal T (say)
of R containing K. So, $f_*(T) = M$ as f is an epimorphism (use $f(f^{-1}(M)) = M \cap f(R) = M$).
Consequently, $f_*$ is surjective. □

Corollary Let (R, +, •) be a ring and I an ideal of R. Then the ideals of R/I are 
in one-to-one correspondence with the ideals of R containing I . 

Proof Let $\pi : R \to R/I$ be the canonical epimorphism. Then $\ker \pi = I$. Now apply
Theorem 5.2.4. □
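
For R = Z and I = 12Z the corollary can be made concrete. By Proposition 5.1.1 the ideals of Z containing 12Z are the ideals dZ with d a positive divisor of 12. The sketch below, with hypothetical function names, assumes that every ideal of the ring of integers modulo n is generated by a single residue (which holds because it is a quotient of Z), enumerates those ideals, and compares them with the images of the ideals dZ under the canonical epimorphism.

```python
def ideals_of_Zn(n):
    """All principal ideals of Z_n; assumed here to be all of its ideals."""
    return {frozenset((a * r) % n for r in range(n)) for a in range(n)}

def image_of_dZ(d, n):
    """Image of the ideal dZ under the canonical map Z -> Z_n."""
    return frozenset((d * r) % n for r in range(n))

n = 12
divisors = [d for d in range(1, n + 1) if n % d == 0]
images = [image_of_dZ(d, n) for d in divisors]

# Distinct divisors give distinct ideals, and every ideal of Z_n arises this way.
print(len(divisors), len(set(images)), len(ideals_of_Zn(n)))
print(set(images) == ideals_of_Zn(n))
```

The counts agree, so the assignment sending dZ to its image is one-to-one and onto for this example.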


Remark Any ideal in R/I is of the form $\pi(A)$ for some ideal A of R containing I.


5.3 Prime Ideals and Maximal Ideals

In this section, we continue discussion of ideals and quotient rings. More precisely, 
we study quotient rings R/I with the help of prime and maximal ideals I of R. So 
prime and maximal ideals are ideals of special interest. In this section we introduce 
the concepts of prime ideals and maximal ideals. These two concepts are different in 
general but coincide for some particular classes of rings. Their characterizations are 
also established in this section. We show that prime ideals of Z and prime integers 
are closely related. 

Throughout this section we assume that R is a commutative ring with identity 1
($\neq 0$).

Definition 5.3.1 A proper ideal P of R is said to be a prime ideal iff for any
$a, b \in R$, $ab \in P \Rightarrow a \in P$ or $b \in P$.

Example 5.3.1 (i) The zero ideal {0} of an integral domain D is a prime ideal, since
for $a, b \in D$, $ab = 0 \Rightarrow a = 0$ or $b = 0$.

(ii) Consider the ring of integers $(\mathbf{Z}, +, \cdot)$. If p is a prime integer, then the
principal ideal $\langle p \rangle$ in Z is a prime ideal, since $mn \in \langle p \rangle \Rightarrow p \mid mn \Rightarrow p \mid m$ or
$p \mid n \Rightarrow m \in \langle p \rangle$ or $n \in \langle p \rangle$. The ideal $I = \langle 6 \rangle = \{6n : n \in \mathbf{Z}\}$ is not prime, since
$3 \cdot 2 = 6 \in I$ but neither $3 \in I$ nor $2 \in I$.

Prime ideals of rings may be characterized in terms of their quotient rings. 

Theorem 5.3.1 A proper ideal I of a commutative ring R with identity 1 is prime
iff the quotient ring R/I is an integral domain.

Proof Since R is a commutative ring with 1, R/I is also a commutative ring with
identity element $1 + I$ and zero element $0 + I = I$.

Let I be prime. Clearly, $1 + I \neq I$, otherwise $1 + I = I$ would imply $1 \in I$ and
hence I = R, a contradiction, since I is a proper ideal.

We claim that R/I has no zero divisors. Let $a + I$ and $b + I \in R/I$. Then
$(a + I)(b + I) = I \Rightarrow ab + I = I \Rightarrow ab \in I \Rightarrow a \in I$ or $b \in I \Rightarrow a + I = I$ or
$b + I = I \Rightarrow R/I$ has no zero divisors. Consequently, R/I is an integral domain.

Conversely, let R/I be an integral domain and $a, b \in R$ such that $ab \in I$. Then
$ab + I = I \Rightarrow (a + I)(b + I) = I \Rightarrow a + I = I$ or $b + I = I \Rightarrow a \in I$ or $b \in I \Rightarrow I$
is a prime ideal. □
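
For R = Z and I generated by n, Theorem 5.3.1 says that the ring of integers modulo n has no zero divisors exactly when the ideal is prime, which for n > 1 means that n is a prime integer. A minimal brute-force sketch over a small range (function names are hypothetical):

```python
def is_prime(n):
    """Trial-division primality test for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def has_zero_divisors(n):
    """True iff Z_n contains non-zero a, b with ab = 0 in Z_n."""
    return any((a * b) % n == 0 for a in range(1, n) for b in range(1, n))

# Z_n is an integral domain exactly for prime n in the tested range.
agree = all(is_prime(n) == (not has_zero_divisors(n)) for n in range(2, 60))
print(agree, has_zero_divisors(6), has_zero_divisors(7))
```

The case n = 6 reproduces Example 5.3.1(ii): the classes of 2 and 3 are non-zero but their product is zero.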

Definition 5.3.2 A proper ideal I of R is said to be a maximal ideal of R iff there
is no ideal J of R such that $I \subsetneq J \subsetneq R$.

Thus I is a maximal ideal of R iff for every ideal N such that $I \subseteq N \subseteq R$, either
$I = N$ or $N = R$ (or $I \subsetneq N \subseteq R \Rightarrow N = R$).

Example 5.3.2 (i) $\langle p \rangle$ is a maximal ideal of Z for each prime integer p. To show
this, suppose there exists an ideal J of $(\mathbf{Z}, +, \cdot)$ such that $\langle p \rangle \subsetneq J$. Then $\exists$ an integer
$n \in J$ such that $n \notin \langle p \rangle \Rightarrow n$ is not divisible by $p \Rightarrow 1 = \gcd(n, p)$. So we can write
$1 = nr + pt$ for some $r, t \in \mathbf{Z}$. Now $n \in J$ and $p \in \langle p \rangle \subsetneq J$ show that $1 \in J \Rightarrow J = \mathbf{Z}$
$\Rightarrow \langle p \rangle$ is a maximal ideal of Z.

(ii) The ideal {0} is a prime ideal of Z but not a maximal ideal, since
$\{0\} \subsetneq \langle 2 \rangle \subsetneq \mathbf{Z}$.

Problem 1 A proper ideal $I = \langle n \rangle$ in $(\mathbf{Z}, +, \cdot)$ is prime iff n = 0 or n is a prime
integer.

Solution Let I be a prime ideal of $(\mathbf{Z}, +, \cdot)$. Then by Proposition 5.1.1, I can be
expressed as $I = \langle n \rangle$ for some $n \ge 0$. If n = 0, then I is prime by Example 5.3.1.
Since $\langle 1 \rangle = \mathbf{Z}$, I is not proper for n = 1. Consider n > 1. If possible, let n be
composite, i.e., $n = rs$, where $1 < r < n$, $1 < s < n$. Then $rs = n \in \langle n \rangle$. But neither
r nor s $\in \langle n \rangle$, otherwise $n \mid r$ or $n \mid s$, which is not possible. So, n cannot be composite.
In other words, n must be prime. The converse follows from Example 5.3.1.

Problem 2 A commutative ring with identity 1 is a field iff {0} is a maximal ideal. 

Solution Suppose R is a field. Then by Theorem 5.1.2, R has only two ideals {0} 
and R. Since {0} is the only proper ideal of R, {0} must be a maximal ideal of R. 

Conversely, suppose {0} is a maximal ideal of R. Let $a\ (\neq 0) \in R$. Now $\{0\} \subsetneq Ra$
$\Rightarrow Ra = R$ by maximality of {0}. Then $1 \in Ra \Rightarrow ta = 1$ for some $t \in R \Rightarrow t = a^{-1} \in R$.
Consequently, R is a field.

In the next theorem, we shall show that maximal ideals of rings may be charac- 
terized in terms of their quotient rings. 

Theorem 5.3.2 Let R be a commutative ring with identity 1. A proper ideal I of R
is a maximal ideal of R iff the quotient ring R/I is a field.

Proof Suppose I is a proper ideal of R such that the quotient ring R/I is a field. We
claim that I is a maximal ideal of R. Suppose there exists an ideal J of R such that
$I \subsetneq J \subseteq R$. Then $\exists$ an element $a \in J$ such that $a \notin I$. Hence $a + I \neq I \Rightarrow a + I$ is a
non-zero element of R/I $\Rightarrow \exists$ an element $b + I \in R/I$ such that $(b + I)(a + I) = 1 + I$
(since R/I is a field) $\Rightarrow ba + I = 1 + I \Rightarrow 1 - ba \in I \Rightarrow 1 - ba \in J$ (since
$I \subsetneq J$) $\Rightarrow 1 \in J$ (since $a \in J \Rightarrow ba \in J$) $\Rightarrow J = R \Rightarrow I$ is maximal.

Conversely, assume that I is a maximal ideal of R. Since R is a commutative ring
with 1 and I is a proper ideal, R/I is a commutative ring with identity $1 + I \neq I$.
Let $a + I$ be a non-zero element of R/I. Then $a + I \neq I \Rightarrow a \notin I$. Consider the set
$A = \{u + ra : u \in I \text{ and } r \in R\}$. Then A is an ideal of R. Now $u = u + 0a \Rightarrow u \in A$
for every $u \in I$. Also, $a = 0 + 1a \Rightarrow a \in A$. Hence $I \subsetneq A \Rightarrow A = R$ (by maximality of I).
Then $\exists$ elements $u \in I$ and $r \in R$ such that $1 = u + ra$. Now $1 + I = (u + ra) + I =$
$(u + I) + (ra + I) = ra + I$ (since $u \in I$, $u + I = I$, the zero element of R/I) $=$
$(r + I)(a + I)$. This shows that the inverse of $a + I$ exists in R/I. Consequently,
R/I is a field. □
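
The second half of the proof produces the inverse of a + I from an expression 1 = u + ra with u in I. For R = Z and I generated by a prime p, such an expression comes from the extended Euclidean algorithm. The following is a minimal sketch under that assumption; the function names are hypothetical and the algorithm is a standard one, not something stated in the text.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def inverse_mod(a, p):
    """Inverse of a + <p> in Z/<p>, read off from 1 = r*a + t*p."""
    g, r, t = extended_gcd(a, p)
    if g != 1:
        raise ValueError("a + <p> is not invertible")
    return r % p  # here u = t*p lies in <p>, so 1 = u + r*a as in the proof

p = 7
print([(a, inverse_mod(a, p)) for a in range(1, p)])
```

Every non-zero class modulo 7 receives an inverse, in line with the theorem; for a composite modulus the call fails for some classes.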

Problem 3 Let R be a commutative ring with 1 ($\neq 0$). Show that the following
statements are equivalent:

(i) R is a field;

(ii) R has no non-trivial ideals;

(iii) {0} is a maximal ideal of R.

[Hint. (i) $\Rightarrow$ (ii) (by Theorem 5.1.2) $\Rightarrow$ (iii) $\Rightarrow$ (i) (by Problem 2).]

Remark A commutative ring R with 1 ($\neq 0$) is a field $\Leftrightarrow$ every non-zero
homomorphism $f : R \to T$ of rings is a monomorphism.

Theorem 5.3.3 Let R be a commutative ring with 1 . Then every maximal ideal of 
R is a prime ideal. 

Proof 

Let I be a maximal ideal of R and $a, b \in R$ such that $ab \in I$. (5.1)

We claim that either $a \in I$ or $b \in I$. Suppose $a \notin I$. Consider the set
$A = \{u + ra : u \in I \text{ and } r \in R\}$. Then proceeding as in the second part of Theorem 5.3.2, we
find that A = R. Then $\exists$ elements $u \in I$ and $r \in R$ such that $1 = u + ra$. Hence
$b = ub + rab \Rightarrow b \in I$ by (5.1), since $ub \in I$ and $rab \in I$. □

Alternative Proof In this case, I is a maximal ideal of R <=> R/I is a field => R/I 
is an integral domain => I is a prime ideal of R. □ 

Remark The converse of Theorem 5.3.3 is not true. It follows from Example 5.3.3. 

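Example 5.3.3 (i) Let Z be the ring of integers. Consider the ring $R = \mathbf{Z} \times \mathbf{Z}$, where
operations are defined componentwise. Then $\mathbf{Z} \times \{0\}$ is a prime ideal of R. Since
$\mathbf{Z} \times \langle 2 \rangle$ is an ideal of R such that $\mathbf{Z} \times \{0\} \subsetneq \mathbf{Z} \times \langle 2 \rangle \subsetneq R$, $\mathbf{Z} \times \{0\}$ cannot be
maximal in R.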

(ii) See Example 5.3.2(ii). 

(iii) See Ex. 7 of the Worked-Out Exercises. 

In the following theorem, we shall now show that in Boolean rings and principal 
ideal domains the concepts of prime ideals and maximal ideals coincide. 

Theorem 5.3.4 Let R be a Boolean ring. An ideal I of R is prime iff I is a maximal 
ideal. 

Proof Let I be a prime ideal of R. We claim that I is a maximal ideal. Let J be an
ideal of R such that $I \subsetneq J \subseteq R$. Since $I \subsetneq J$, $\exists$ an element $a \in J$ such that $a \notin I$. Now
$a(1 - a) = a - a^2 = a - a = 0 \in I \Rightarrow$ either $a \in I$ or $1 - a \in I \Rightarrow 1 - a \in I \subseteq J$
(as $a \notin I$) $\Rightarrow 1 \in J$ (as $a \in J$) $\Rightarrow J = R \Rightarrow I$ is a maximal ideal of R.

Conversely, if I is a maximal ideal, then I is a prime ideal by Theorem 5.3.3. □ 
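
Theorem 5.3.4 can be checked by brute force on a small Boolean ring. The sketch below is an illustration under stated assumptions: the ring is taken to be the subsets of a three-element set, encoded as the bitmasks 0 to 7, with XOR as addition, AND as multiplication and 7 as the identity. All names are hypothetical.

```python
from itertools import combinations

ELEMENTS = range(8)  # bitmasks; addition is XOR, multiplication is AND
ONE = 7              # identity element of this Boolean ring

def is_ideal(s):
    """s contains 0, is closed under addition, and absorbs multiplication."""
    return (
        0 in s
        and all((a ^ b) in s for a in s for b in s)
        and all((r & a) in s for a in s for r in ELEMENTS)
    )

subsets = (frozenset(c) for k in range(1, 9) for c in combinations(ELEMENTS, k))
ideals = [s for s in subsets if is_ideal(s)]
proper = [s for s in ideals if len(s) < 8]

def is_prime_ideal(s):
    return all(a in s or b in s for a in ELEMENTS for b in ELEMENTS if (a & b) in s)

def is_maximal_ideal(s):
    # no proper ideal strictly contains s
    return not any(s < j for j in proper)

# The identity a(1 - a) = 0 used in the proof; here 1 - a is ONE ^ a.
identity_holds = all(a & (ONE ^ a) == 0 for a in ELEMENTS)

prime_ideals = [s for s in proper if is_prime_ideal(s)]
maximal_ideals = [s for s in proper if is_maximal_ideal(s)]
print(identity_holds, len(ideals), prime_ideals == maximal_ideals, len(prime_ideals))
```

In this ring the proper prime ideals and the maximal ideals turn out to be the same three ideals.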

Definition 5.3.3 A principal ideal domain (PID) R is a principal ideal ring which 
is an integral domain. 

Theorem 5.3.5 Let R be a principal ideal domain. Then a non-trivial ideal I of R 
is a prime ideal iff I is a maximal ideal. 

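Proof Let I be a non-trivial prime ideal of R. Suppose J is an ideal of R such that
$I \subsetneq J \subseteq R$. Since R is a PID, we write $I = \langle a \rangle$ and $J = \langle b \rangle$ for some $a, b \in R \setminus \{0\}$.
Now $a \in \langle a \rangle \subsetneq \langle b \rangle \Rightarrow a = rb$ for some $r \in R \Rightarrow$ either $r \in \langle a \rangle$ or $b \in \langle a \rangle$ as $\langle a \rangle$ is
prime. But $b \notin \langle a \rangle$, otherwise $\langle b \rangle \subseteq \langle a \rangle$ leads to a contradiction. So, $r \in \langle a \rangle \Rightarrow r = ta$
for some $t \in R \Rightarrow a = rb = (ta)b = (tb)a \Rightarrow tb = 1$ (as R is an integral domain
and $a \neq 0$) $\Rightarrow 1 \in \langle b \rangle = J \Rightarrow J = R \Rightarrow I$ is a maximal ideal of R.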

The converse part follows from Theorem 5.3.3. □ 

Corollary A non-trivial ideal of ( Z, +, •) is maximal iff it is prime. 

The following theorem is an interesting application of Zorn’s Lemma. 

Theorem 5.3.6 If a ring R is finitely generated, then each proper ideal of R is
contained in a maximal ideal.

Proof Let A be a proper ideal of a finitely generated ring R. Suppose
$R = \langle x_1, x_2, \ldots, x_n \rangle$. Let D be the set of all ideals B of R satisfying $A \subseteq B \subsetneq R$,
partially ordered by set inclusion. Then $A \in D \Rightarrow D \neq \emptyset$. Consider a chain $\{B_i : i \in I\}$
in D. Then $T = \bigcup_{i \in I} B_i$ is an ideal of R.

We claim that T is a proper ideal of R. If possible, let $T = R = \langle x_1, x_2, \ldots, x_n \rangle$.
Then each generator $x_k$ ($k = 1, 2, \ldots, n$) would belong to some ideal $B_{i_k}$ of the
chain $\{B_i\}$. As there are only finitely many $B_{i_k}$, one contains all others, call it $B_t$.
Thus $x_1, x_2, \ldots, x_n$ all lie in this one $B_t$. Consequently, $B_t = R$, which is not
possible. Again $A \subseteq \bigcup_i B_i = T \subsetneq R \Rightarrow T \in D$, so T is an upper bound of the chain in D.
Hence by Zorn’s Lemma the family D contains a maximal element M. Then M is an
ideal of R with $A \subseteq M \subsetneq R$. We claim that M is a maximal ideal of R. To show this,
let J be an ideal of R such that $M \subsetneq J \subseteq R$. Since M is a maximal element of the
family D, $J \notin D$. Accordingly, J cannot be a proper ideal of R. Hence J = R. This
shows that M is a maximal ideal of R. □

Corollary (Krull-Zorn) Let R be a ring with identity 1 . Then each proper ideal of 
R is contained in a maximal ideal of R. 

Proof The proof is immediate by Theorem 5.3.6, since $R = \langle 1 \rangle$. □

Remark In Appendix A, we present generalizations of some concepts of rings to
the corresponding concepts for semirings.


5.4 Local Rings 

The concept of local rings was introduced by W. Krull (1899-1971) in 1938. Rings 
with a unique maximal ideal are of immense interest. 

Definition 5.4.1 A local ring is a commutative ring with identity which has a 
unique maximal ideal. 

Since every proper ideal of a ring R with 1 is contained in a maximal ideal of R 
(see the Corollary of Theorem 5.3.6), it follows that the unique maximal ideal of a 
local ring R contains every proper ideal of the ring. 

All fields are local rings. 

Example 5.4.1 Let R be a commutative ring with 1 in which the set of all non-units
of R forms an ideal M. Then R is a local ring. This is so because M is a maximal
ideal of R and M contains every proper ideal of R.
To show this, let I be an ideal of R such that $M \subsetneq I \subseteq R$. Then $\exists$ a unit element
a (say) in I. Hence $I = R \Rightarrow M$ is maximal. Next let A be any proper ideal of R.
Then every element of A must be a non-unit. Hence $A \subseteq M$.

We now search for another example of a local ring. 

Theorem 5.4.1 For any field F, the power series ring $F[[x]]$ is a principal ideal
domain.

Proof Let A be any proper ideal of $F[[x]]$. If $A = \{0\}$, then $A = \langle 0 \rangle$. If $A \neq \{0\}$,
then $\exists$ a non-zero power series $f(x) \in A$ of minimum order r (say), so that

$$f(x) = a_r x^r + a_{r+1} x^{r+1} + \cdots = x^r (a_r + a_{r+1} x + \cdots), \quad \text{where } a_r \neq 0.$$

Then $a_r + a_{r+1} x + \cdots$ is invertible in $F[[x]]$ by Lemma 4.5.1 of Sect. 4.5 of Chap. 4.
Hence $f(x) = x^r g(x)$, where $g(x)$ has an inverse in $F[[x]] \Rightarrow x^r = f(x) g(x)^{-1} \in A \Rightarrow \langle x^r \rangle \subseteq A$.
For the reverse inclusion, let $t(x)$ be any non-zero power series in A of order n. Then
$r \le n \Rightarrow t(x)$ can be written in the form:

$$t(x) = x^r (b_n x^{n-r} + b_{n+1} x^{n-r+1} + \cdots) \in \langle x^r \rangle.$$

Then $A \subseteq \langle x^r \rangle$. Consequently, $A = \langle x^r \rangle$. □

Corollary For any field F, $F[[x]]$ is a local ring with $\langle x \rangle$ as its maximal ideal.

Proof As the non-trivial ideals of $F[[x]]$ are given by $\langle x^r \rangle$ for integers $r \ge 1$,
it follows that $F[[x]] \supsetneq \langle x \rangle \supsetneq \langle x^2 \rangle \supsetneq \cdots \supsetneq \{0\}$. This proves the corollary. □
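
The proof of Theorem 5.4.1 depends on the fact that a power series with non-zero constant term is invertible. The sketch below illustrates this on truncated series, assuming F is the field of rational numbers and that a series is stored as a list of its first coefficients; the recursive formula for the inverse is a standard one and the function names are hypothetical.

```python
from fractions import Fraction

def inverse_series(a, terms):
    """First `terms` coefficients of 1/a(x), assuming a[0] != 0."""
    if a[0] == 0:
        raise ValueError("constant term must be non-zero")
    a = [Fraction(c) for c in a] + [Fraction(0)] * terms
    b = [1 / a[0]]
    for n in range(1, terms):
        # coefficient of x^n in a(x) * b(x) must vanish
        s = sum(a[k] * b[n - k] for k in range(1, n + 1))
        b.append(-s / a[0])
    return b

def multiply_truncated(a, b, terms):
    """Product of two series, keeping only coefficients of degree < terms."""
    out = [Fraction(0)] * terms
    for i, x in enumerate(a[:terms]):
        for j, y in enumerate(b[: terms - i]):
            out[i + j] += x * y
    return out

geometric = inverse_series([1, -1], 5)  # 1/(1 - x)
print(geometric)
print(multiply_truncated([2, 3, 5], inverse_series([2, 3, 5], 6), 6))
```

A series with zero constant term, such as x itself, has no inverse, which is why the non-units are exactly the elements of the ideal generated by x.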


5.5 Application to Algebraic Geometry 

Classical algebraic geometry arose through the study of the sets of simultaneous 
solutions of systems of polynomial equations in n variables over an algebraically 
closed field, like the field of complex numbers C. The theory of Noetherian rings 
and the Hilbert Basis Theorem (to be discussed in Chap. 7), prescribe methods for 
solutions of many problems of algebraic geometry. So the ideal theory imposes a 
considerable influence on algebraic geometry. For example, affine varieties play an 
important role in algebraic geometry and Zariski topology. Moreover, affine vari- 
eties are characterized in this section with the help of prime ideals. Oscar Zariski 
and Andre Weil redeveloped the foundation of algebraic geometry in the middle of 
the 20th century. 


5.5.1 Affine Algebraic Sets 

Let K be an algebraically closed field (i.e., every polynomial in one variable of
degree $\ge 1$ over K has a root in K). An affine n-space over K, denoted by $K^n$, is
the set of all n-tuples $(x) = (x_1, x_2, \ldots, x_n)$ of elements of K. We call $K^1$ the affine
line and $K^2$ the affine plane. The n-tuple $(x) = (x_1, x_2, \ldots, x_n)$ is called a point of
$K^n$. The n-tuple (x) is called a zero of a polynomial $f(x) = f(x_1, x_2, \ldots, x_n) \in K[x_1, x_2, \ldots, x_n]$
iff $f(x) = f(x_1, x_2, \ldots, x_n) = 0$.

Given a subset S of polynomials in $K[x_1, x_2, \ldots, x_n]$, the algebraic set of zeros
of S, denoted by $V(S)$, is the subset of $K^n$ consisting of all common zeros
of polynomials in S, i.e., $V(S) = \{(x) \in K^n : f(x) = 0\ \forall f \in S\}$. A subset A of
$K^n$ is called an affine algebraic set iff $\exists$ a subset S of $K[x_1, x_2, \ldots, x_n]$ such that
$A = V(S)$. An affine algebraic set is sometimes called (simply) an algebraic set.
The set of all zeros of a single polynomial $f(x_1, x_2)$, i.e., the set of points of $K^2$
satisfying the polynomial equation $f(x_1, x_2) = 0$ over K, is called an affine algebraic
curve in the affine plane $K^2$. Similarly, the set of all zeros of a single polynomial
equation $f(x_1, x_2, x_3) = 0$ is called an affine algebraic surface in $K^3$. The affine
algebraic set in $K^3$ determined by two polynomial equations $f(x_1, x_2, x_3) = 0$ and
$g(x_1, x_2, x_3) = 0$ is called a space curve. If $f(x_1, x_2, \ldots, x_n)$ is not a constant, then
the set of zeros of f is called the hypersurface defined by f and is denoted by $V(f)$.

We now present some elementary properties of algebraic sets in affine n-space $K^n$.

Proposition 5.5.1 Let $K^n$ be an affine n-space. Then

(i) $P \subseteq Q$ in $K[x_1, x_2, \ldots, x_n]$ implies $V(P) \supseteq V(Q)$ in $K^n$.

(ii) $V(0) = K^n$, $V(a) = \emptyset$, where $a \neq 0$ and $a \in K$.

(iii) If $\{P_\alpha\}$ is a family of ideals in $K[x_1, x_2, \ldots, x_n]$, then $V(\bigcup_\alpha P_\alpha) = \bigcap_\alpha V(P_\alpha)$.

(iv) If P and Q are ideals in $K[x_1, x_2, \ldots, x_n]$ and $A = V(P)$ and $B = V(Q)$,
then $A \cup B = V(P \cap Q) = V(PQ)$.

(v) Any finite subset of $K^n$ is an algebraic set.

Proof (i) $(x) \in V(Q) \Rightarrow (x)$ is a common zero of polynomials of Q $\Rightarrow (x)$ is also
a common zero of polynomials of P $\Rightarrow (x) \in V(P) \Rightarrow V(Q) \subseteq V(P)$.

(ii) As every point of $K^n$ satisfies the zero polynomial, $V(0) = K^n$. As the
constant non-null polynomial ‘a’ has no zero, $V(a) = \emptyset$.

(iii) $(x) \in V(\bigcup_\alpha P_\alpha) \Leftrightarrow f(x) = 0\ \forall f \in \bigcup_\alpha P_\alpha \Leftrightarrow f(x) = 0\ \forall f \in P_\alpha$, where $P_\alpha$
is any member of the collection $\{P_\alpha\} \Leftrightarrow (x) \in V(P_\alpha)\ \forall \alpha \Leftrightarrow (x) \in \bigcap_\alpha V(P_\alpha)$.

(iv) $(x) \in A \cup B = V(P) \cup V(Q) \Rightarrow (x)$ is a zero of all polynomials in P or (x)
is a zero of all polynomials in Q $\Rightarrow (x)$ is a zero of all polynomials in $P \cap Q \Rightarrow (x)$
is a zero of all polynomials in PQ (since $PQ \subseteq P \cap Q$) $\Rightarrow A \cup B \subseteq V(P \cap Q) \subseteq V(PQ)$.

Again $(x) \notin A \cup B \Rightarrow \exists f \in P$ and $g \in Q$ such that $f(x) \neq 0$ and $g(x) \neq 0 \Rightarrow$
$fg \in P \cap Q$ (and also $fg \in PQ$) does not vanish at $(x) \Rightarrow (x) \notin V(P \cap Q)$ and also
$(x) \notin V(PQ) \Rightarrow$ the zeros of $P \cap Q$ (as well as of PQ) are the points of $A \cup B$ and only
these.

(v) Any point $(x_1, x_2, \ldots, x_n) \in K^n$ is an algebraic set, because
$V(X_1 - x_1, X_2 - x_2, \ldots, X_n - x_n) = \{(x_1, x_2, \ldots, x_n)\}$. Hence any finite subset A
of $K^n$ is an algebraic set by (iv). □
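
The set-theoretic identities in Proposition 5.5.1 can be tried out on a finite grid. The sketch below is only an illustration under a loudly stated assumption: it replaces the algebraically closed field K by the finite field with five elements, which is not algebraically closed, because the arguments for (i) to (v) use only that K is a field. Polynomials are represented as plain Python functions, and V is applied to finite lists of polynomials rather than to ideals; all names are hypothetical.

```python
from itertools import product

Q = 5  # assumption: work over the finite field F_5 as a stand-in for K
PLANE = set(product(range(Q), repeat=2))

def V(polys):
    """Common zeros in F_5 x F_5 of a list of polynomial functions."""
    return {pt for pt in PLANE if all(f(*pt) % Q == 0 for f in polys)}

def f(x, y):
    return x * x + y * y - 1

def g(x, y):
    return x - y

def fg(x, y):
    return f(x, y) * g(x, y)

def zero(x, y):
    return 0

def one(x, y):
    return 1

checks = {
    "(i)": V([f, g]) <= V([f]),                      # more equations, fewer zeros
    "(ii)": V([zero]) == PLANE and V([one]) == set(),
    "(iii)": V([f, g]) == V([f]) & V([g]),           # union of equations, intersection of zero sets
    "(iv)": V([fg]) == V([f]) | V([g]),              # product of equations, union of zero sets
    "(v)": V([lambda x, y: x - 2, lambda x, y: y - 3]) == {(2, 3)},
}
print(checks)
```

Item (iv) is checked here only in the special case of two single polynomials, where the product ideal is generated by the product fg.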

Using the fact that any algebraic set is of the form $V(P_\alpha)$, it follows from
Proposition 5.5.1 that

Corollary Let $K^n$ be an affine n-space. Then

(i) the intersection of any collection of algebraic sets in $K^n$ is an algebraic set;

(ii) the union of any two algebraic sets in $K^n$ is an algebraic set.

We now determine all the algebraic sets of the affine line $K^1$.

Proposition 5.5.2 A subset A of $K^1$ is algebraic iff either $A = K^1 = K$ or A is a finite
subset of $K^1$.

Proof We have seen that the condition is sufficient. To prove the necessity of the
condition, note that K[x] is a PID, so every algebraic set in $K^1$ is the zero set $V(f)$ of a
single polynomial f; and a non-zero polynomial $f \in K[x]$ has a finite number of zeros
($\le \deg f$), so it cannot vanish on an infinite set. □

Remark Proposition 5.5.2 shows that an infinite union of algebraic sets may not be 
an algebraic set. 


5.5.2 Ideal of a Set of Points in $K^n$

Given a subset A in $K^n$, define the subset $I(A)$ in $K[x_1, x_2, \ldots, x_n]$ to be the set of
all polynomials which vanish on A, i.e.,

$$I(A) = \{f \in K[x_1, x_2, \ldots, x_n] : f(x) = 0\ \forall (x) \in A\}.$$

Then $I(A)$ is an ideal of $K[x_1, x_2, \ldots, x_n]$. To show this, let $f, g \in I(A)$. Then
$f(x) = 0$ and $g(x) = 0$ $\forall (x) \in A$. Consequently, $(f + g)(x) = f(x) + g(x) = 0$
$\forall (x) \in A$ and $(hf)(x) = h(x) f(x) = 0$ $\forall (x) \in A$ and $\forall h \in K[x_1, x_2, \ldots, x_n]$.

Remark $I(A)$ is called the ideal of A.

Proposition 5.5.3 Let $K^n$ be an affine n-space. Then

(i) $A \subseteq B$ in $K^n \Rightarrow I(A) \supseteq I(B)$ in $K[x_1, x_2, \ldots, x_n]$;

(ii) $S \subseteq I(V(S))$ for any set of polynomials S in $K[x_1, x_2, \ldots, x_n]$;

(iii) $A \subseteq V(I(A))$ for any set of points A in $K^n$;

(iv) $V(S) = V(I(V(S)))$ for any set of polynomials S in $K[x_1, x_2, \ldots, x_n]$;

(v) $I(A) = I(V(I(A)))$ for any set of points A in $K^n$;

(vi) $I(\emptyset) = K[x_1, x_2, \ldots, x_n]$;

(vii) $I(K^n) = \{0\}$, if K is an infinite field.

Proof (i) $f \in I(B) \Rightarrow f$ vanishes on B $\Rightarrow f$ vanishes on A (since $A \subseteq B$) $\Rightarrow f \in I(A)$
$\Rightarrow I(B) \subseteq I(A)$.

(ii) This follows from the fact that S is a set of some polynomials which vanish
on $V(S)$, while $I(V(S))$ is the set of all polynomials which vanish on $V(S)$.

(iii) This follows from the fact that A is a set of some common zeros of
polynomials in $I(A)$, while $V(I(A))$ is the set of all common zeros of polynomials in
$I(A)$.

(iv) $V(S) \subseteq V(I(V(S)))$ by (iii) (take $V(S)$ for A in (iii)).

Conversely, $V(I(V(S))) \subseteq V(S)$. Because $(x) \in V(I(V(S))) \Rightarrow f(x) = 0\ \forall f \in I(V(S))$
$\Rightarrow f(x) = 0\ \forall f \in S$ by (ii) $\Rightarrow (x) \in V(S)$.

(v) $I(A) \subseteq I(V(I(A)))$ by (ii) (take $I(A)$ for S in (ii)).

Conversely, $I(V(I(A))) \subseteq I(A)$ by (iii). This is so because $f \in I(V(I(A))) \Rightarrow$
$f(x) = 0\ \forall (x) \in V(I(A)) \Rightarrow f(x) = 0\ \forall (x) \in A$ by (iii) $\Rightarrow f \in I(A)$.

(vi) $V(1) = \emptyset \Rightarrow I(V(1)) = I(\emptyset)$. Now by (ii), $1 \in I(V(1)) \Rightarrow I(\emptyset) = K[x_1, x_2, \ldots, x_n]$.

(vii) This part says that if K is an infinite field and $f \in K[x_1, x_2, \ldots, x_n]$, then f
vanishes on $K^n$ implies f = 0. We shall prove this by induction on n. If n = 1,
a non-zero $f \in K[x_1]$ has a finite number of zeros ($\le \deg f$), by the fundamental
theorem of algebra (see Chap. 11), and hence f cannot vanish on the infinite
set K, unless f = 0. Next, suppose that the result is true for (n - 1) variables.
Let $f \in K[x_1, x_2, \ldots, x_n]$ vanish on $K^n$. We write
$f(x_1, x_2, \ldots, x_n) = \sum_i f_i(x_1, x_2, \ldots, x_{n-1})\, x_n^i$ as a polynomial in $x_n$ with coefficients in
$K[x_1, \ldots, x_{n-1}]$. If $\exists$ a point $(a_1, a_2, \ldots, a_{n-1}) \in K^{n-1}$ such that for some j,
$f_j(a_1, \ldots, a_{n-1}) \neq 0$, then the non-zero polynomial $f(a_1, a_2, \ldots, a_{n-1}, x_n) \in K[x_n]$
vanishes on K, since $f(x_1, \ldots, x_n)$ vanishes on $K^n$. This implies, by the fundamental
theorem of algebra, that $f(a_1, a_2, \ldots, a_{n-1}, x_n) = 0$, which is impossible, since
$f_j(a_1, \ldots, a_{n-1}) \neq 0$. Therefore $f_j$ must vanish on $K^{n-1}$ $\forall j$, and hence $f_j = 0$ $\forall j$,
by induction hypothesis. Consequently, f = 0. □

Remark If A is an algebraic set and $A = V(S)$, then $A = V(I(A))$; and if P is an
ideal of an algebraic set A and $P = I(A)$, then $P = I(V(P))$.
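
The hypothesis in Proposition 5.5.3(vii) that the field is infinite cannot be dropped. The following minimal sketch, which steps outside the algebraically closed setting of this section and uses the finite field with p elements purely as an assumption for illustration, shows a non-zero polynomial that vanishes at every point of the affine line. The function name and the coefficient-list encoding are hypothetical.

```python
def vanishes_everywhere(coeffs, p):
    """coeffs[i] is the coefficient of x**i; test f(a) = 0 in F_p for every a."""
    return all(
        sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p == 0
        for a in range(p)
    )

p = 5
f = [0, -1, 0, 0, 0, 1]  # the polynomial x^5 - x

is_nonzero = any(c % p != 0 for c in f)
print(is_nonzero, vanishes_everywhere(f, p))
```

So over this finite field the ideal of the whole line contains a non-zero polynomial, whereas over an infinite field part (vii) rules this out.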


5.5.3 Affine Variety 

An algebraic set A in $K^n$ is called irreducible or an affine variety iff $A \neq B \cup C$,
where B and C are algebraic sets in $K^n$ and $A \neq B$, $A \neq C$, i.e., iff A is not the union
of two algebraic sets B and C distinct from A. Otherwise A is called reducible.

Theorem 5.5.1 An algebraic set A is an affine variety iff $I(A)$ is a prime ideal.

Proof Suppose A is an algebraic set and P is its ideal, i.e., $P = I(A)$. Then
$A = V(P)$ by Proposition 5.5.3(iv). If P is not prime, we can find
$f, g \in K[x_1, x_2, \ldots, x_n]$ such that $f \notin P$, $g \notin P$ but $fg \in P$. Now by Proposition 5.5.1(i)
and (iii), we get

$$f \notin P \Rightarrow \{f\} \not\subseteq P \Rightarrow V(P) \cap V(f) = V(P \cup \{f\}) \subsetneq V(P),$$

$$g \notin P \Rightarrow \{g\} \not\subseteq P \Rightarrow V(P) \cap V(g) = V(P \cup \{g\}) \subsetneq V(P).$$

Also

$$[V(P) \cap V(f)] \cup [V(P) \cap V(g)] = V(P) \cap [V(f) \cup V(g)]$$

(by distributive laws of sets) $= V(P) \cap V(fg)$ (by Proposition 5.5.1(iv)) $= V(P \cup \{fg\}) = V(P)$,
since $fg \in P$. Thus the algebraic set $A = V(P)$ is the union of two
algebraic sets $V(P) \cap V(f)$ and $V(P) \cap V(g)$, which are distinct from A, and hence
A is reducible.

Conversely, if A is reducible, then $A = B \cup C$, where B and C are algebraic
sets different from A. Then $A \supsetneq B \Rightarrow I(A) \subsetneq I(B)$ and $A \supsetneq C \Rightarrow I(A) \subsetneq I(C)$.
Therefore, if $f \in I(B)$, $g \in I(C)$ and $f, g \notin I(A)$, then $fg \in I(A)$. This is so
because $A = B \cup C = V(I(B)) \cup V(I(C))$ (by Proposition 5.5.3) $= V(I(B) I(C))$ (by
Proposition 5.5.1(iv)) and hence $I(A) = I(V(I(B) I(C))) \supseteq I(B) I(C)$ (by
Proposition 5.5.3(ii)) $\Rightarrow I(A)$ is not prime (see Ex. 11(c) of Exercises-I). We have proved
that an algebraic set A is reducible $\Leftrightarrow I(A)$ is not a prime ideal. Equivalently, A is
an affine variety $\Leftrightarrow I(A)$ is a prime ideal. □

Corollary An affine algebraic set A in $K^n$ is an affine variety iff the coordinate
ring $K[x_1, x_2, \ldots, x_n]/I(A)$ is an integral domain.


5.5.4 The Zariski Topology in $K^n$

Zariski topology is a particular topology introduced by Oscar Zariski (1899-1986)
around 1950 to study algebraic varieties. This topology in $\mathbf{C}^n$ has an advantage
over the metric topology, because Zariski topology has many fewer open sets than
the metric topology. A refinement of Zariski topology was made by Alexander
Grothendieck in 1960.

Let k be a subfield of K. If A is an affine algebraic set in $K^n$ such that $I(A)$
admits a set of generators in $k[x_1, x_2, \ldots, x_n] \subseteq K[x_1, x_2, \ldots, x_n]$, then A is called
an affine (K, k) algebraic set and k is called the field of definition of A. Thus an
affine (K, k) algebraic set A is a subset in $K^n$ consisting of all common zeros of
a subset of polynomials (or an ideal) in $k[x_1, x_2, \ldots, x_n]$. If k = K, we call A an
absolute affine algebraic set (or simply an affine algebraic set) in $K^n$. Define a
topology in $K^n$, called the Zariski topology or k-topology, in the following way:

A subset U of $K^n$ is an open set iff $K^n - U$ is an affine k-algebraic set. Thus the
affine k-algebraic sets in $K^n$ are the closed sets in $K^n$. As the results of Proposition
5.5.1(ii)-(iv) for affine algebraic sets are also true for affine (K, k) algebraic sets, it
follows that the empty set and the whole set are closed sets; the intersection of any
arbitrary family of closed sets is a closed set; and the union of two closed sets is a
closed set. Hence these closed sets define a topology in $K^n$, the Zariski topology.

Any subset A of $K^n$ is given the induced topology from the Zariski topology in $K^n$.
Thus if A is an affine k-variety in $K^n$, then a subset of A is closed in A iff it is an
affine k-algebraic set.

Proposition 5.5.4 (i) The Zariski topology in $K^n$ is not Hausdorff.

(ii) The Zariski topology in $K^n$ may not be $T_1$ unless $K = k$.

Proof (i) Let $A$ be a $k$-variety in $K^n$. Then for any two non-empty open sets
$U_1$ and $U_2$ in $A$, we have $U_1 \cap U_2 \neq \emptyset$. This is so because if
$U_1 \cap U_2 = \emptyset$, then
$A = (A - U_1) \cup (A - U_2) \cup (U_1 \cap U_2) = (A - U_1) \cup (A - U_2)$, and hence $A$ is
the union of two affine $k$-algebraic sets $A - U_1$ and $A - U_2$, which are
different from $A$. In other words, $A$ is not a $k$-variety in $K^n$, which is a
contradiction. Therefore, if $x$ and $y$ are distinct points of $A$, it is not
possible to find disjoint neighborhoods $U_1$ and $U_2$ of $x$ and $y$ respectively.
This means that $A$ is not a Hausdorff space. Therefore, $K^n$ cannot be a
Hausdorff space, because every subspace of a Hausdorff space is a Hausdorff
space.

(ii) If $k = K$, then by Proposition 5.5.1(v), any point of $K^n$ is an algebraic set
and hence closed in the Zariski topology. If $k \neq K$, and if
$(\alpha_1, \alpha_2, \ldots, \alpha_n) \in K^n$ is not a zero of any non-zero polynomial in
$k[x_1, x_2, \ldots, x_n]$, then the point $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is not a
$k$-algebraic set and hence not closed. □

Remark The proof of Proposition 5.5.4 shows that any non-empty open set $U$ of a
$k$-variety $A$ is dense in $A$, i.e., $\overline{U} = A$. Also, any open set of a
$k$-variety is connected, because it is not the union of two disjoint non-empty
open sets.

Problem 4 (a) Let $s(I)$ be the set of all ideals of $K[x_1, x_2, \ldots, x_n]$ and $s(A)$
be the set of all algebraic sets in $K^n$. Then the mapping $V : s(I) \to s(A)$,
$I \mapsto V(I)$ satisfies the following properties:

(i) $V(0) = K^n$.

(ii) $V(K[x_1, x_2, \ldots, x_n]) = \emptyset$.

(iii) $V(\sum_\alpha I_\alpha) = \bigcap_\alpha V(I_\alpha)$.

(iv) $V(I \cap J) = V(IJ) = V(I) \cup V(J)$.

(b) By using (a), define the Zariski topology on $K^n$.

(c) Hence deduce that if $K$ is an infinite field, then $K^1$ endowed with the
Zariski topology is not a Hausdorff space.

Solution (a) follows from Proposition 5.5.1.

(b) As the properties (i)-(iv) satisfy the axioms for closed sets of a topology, the
Zariski topology is defined on $K^n$, with algebraic sets as the only closed sets.

(c) If $M$ is a maximal ideal of $K[x_1, x_2, \ldots, x_n]$ of the form
$(x_1 - t_1, x_2 - t_2, \ldots, x_n - t_n)$, $t = (t_1, t_2, \ldots, t_n) \in K^n$, then
$V(M) = \{t\}$. Thus finite sets of points are closed sets in the Zariski topology
in $K^n$. They are the only closed sets in $K^1$ except $K^1$ and the empty set
$\emptyset$. If the field $K$ is infinite, any two non-empty open sets of $K^1$ have
finite complements and therefore intersect. Hence $K^1$ endowed with the Zariski
topology is not Hausdorff.

Problem 5 If $K$ is not algebraically closed, for a proper ideal $I$ of
$K[x_1, x_2, \ldots, x_n]$, can $V(I)$ be empty?

Solution $V(I)$ may be empty. For example, if $K = \mathbb{R}$ (which is not algebraically
closed), $(x^2 + 1)$ is an ideal of $K[x]$ such that $(x^2 + 1)$ is proper but
$V(x^2 + 1) = \emptyset$.
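
The phenomenon in Problem 5 can be checked by brute force when the field is finite. The sketch below is only an illustration under an assumption that is not in the text: it replaces $\mathbb{R}$ by the prime field $\mathbb{F}_3$ (also not algebraically closed), and the helper name `zero_set` is hypothetical. It lists the zeros in $\mathbb{F}_p$ of a one-variable polynomial given by its coefficient list.

```python
def zero_set(coeffs, p):
    """Zeros in F_p of the polynomial sum(coeffs[i] * x**i), found by brute force.

    coeffs[i] is the coefficient of x**i; p is assumed to be a prime number.
    """
    return [
        t for t in range(p)
        if sum(c * pow(t, i, p) for i, c in enumerate(coeffs)) % p == 0
    ]


# x^2 + 1 generates a proper ideal of F_3[x], yet it has no zero in F_3.
print(zero_set([1, 0, 1], 3))
# Over F_5 the same polynomial does have zeros, so V(x^2 + 1) is non-empty there.
print(zero_set([1, 0, 1], 5))
```

The same function also illustrates Problem 4(c): a non-zero polynomial in one variable has only finitely many zeros, so the proper closed sets of $K^1$ are finite.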

Problem 6 Let $s(A)$ denote the set of all algebraic sets in $K^n$, $s(I)$ denote the
set of all ideals of $K[x_1, x_2, \ldots, x_n]$ and $sr(I)$ denote the set of all radical
ideals of $K[x_1, \ldots, x_n]$.

(a) Then the mapping $I : s(A) \to s(I)$, $A \mapsto I(A)$ is injective.

(b) Then the mapping $I \circ V : sr(I) \to sr(I)$, $J \mapsto I(V(J))$ is the identity
function, where $K$ is algebraically closed.

(c) If $K$ is an algebraically closed field and $J$ is an ideal of the ring
$K[x_1, x_2, \ldots, x_n]$, then $I(V(J)) = \mathrm{rad}(J)$, i.e., if a polynomial $f$
vanishes at all common zeros of polynomials in $J$, then some power of $f$
belongs to $J$.

(d) If $K$ is an algebraically closed field, then there exists a bijective
correspondence between the algebraic sets contained in $K^n$ and the radical
ideals of the ring $K[x_1, x_2, \ldots, x_n]$ (i.e., ideals $J$ such that
$\mathrm{rad}(J) = J$).

Solution (a) It follows from Proposition 5.5.3(iv) that

$$A = V(I(A)) = (V \circ I)(A), \quad \forall A \in s(A)$$

$\Rightarrow V \circ I$ is the identity function on $s(A)$ $\Rightarrow I$ is injective.

(b) Take in particular, $I(A) = J$. Then $J$ is a radical ideal in
$K[x_1, x_2, \ldots, x_n]$ and hence by Ex. 7 of Exercises-I,
$sr(I) = \{J : J = I(A) \text{ for some } A \in s(A)\}$. Then Proposition 5.5.3(v)
$\Rightarrow J = I(V(J)) = (I \circ V)(J)$, $\forall J \in sr(I)$ $\Rightarrow I \circ V$ is the
identity function on $sr(I)$ $\Rightarrow I$ is a surjection.

(c) Since $I(A) = J$ is a radical ideal (see Ex. 7 of Exercises-I), (c) follows from
Proposition 5.5.3(v).

(d) follows from (a) and (b).

We summarize the above discussion in the following basic and important result.

Theorem 5.5.2 (Hilbert’s Nullstellensatz) Let $K$ be an algebraically closed field.
Then $I(V(J)) = \mathrm{rad}(J)$ for every ideal $J$ in $K[x_1, x_2, \ldots, x_n]$. Moreover,
there exists a bijective correspondence between the set of affine algebraic sets
in $K^n$ and the set of radical ideals in $K[x_1, x_2, \ldots, x_n]$.

Remark 1 Hilbert’s Nullstellensatz theorem is the first result in representing state- 
ments about geometry in the language of algebra. 

Remark 2 Hilbert’s Nullstellensatz theorem identifies the set of maximal ideals of
the polynomial ring $\mathbb{C}[x_1, x_2, \ldots, x_n]$ with the points in the affine space
$\mathbb{C}^n$.

Remark 3 Hilbert’s Nullstellensatz theorem may be considered as a
multidimensional version of the fundamental theorem of algebra.

Remark 4 Hilbert’s Nullstellensatz theorem fails to be true over the field
$\mathbb{R}$ (which is not algebraically closed) [see Smith et al. 2000].


5.6 Chinese Remainder Theorem 

It is so named because it generalizes the result in part (c) of Theorem 5.6.1,
which is a classical result of number theory, proved by Chinese mathematicians
during the first century AD. It provides a representation of $\mathbb{Z}_n$ and has many
interesting applications (see Chaps. 10 and 12). This theorem has several forms
according to the context. Throughout this section we assume $R$ is a commutative
ring with 1. This theorem is used to prove structure theorems for a class of
modules.

Definition 5.6.1 Two ideals A and B of R are said to be coprime (comaximal) iff 
A + B = R. 

Proposition 5.6.1 If ideals $A$ and $B$ are coprime in $R$, then $A \cap B = AB$.

Proof Clearly, $AB \subseteq A \cap B$. Since $A + B = R$, $\exists\, a \in A$, $b \in B$ such that
$a + b = 1$. Let $c \in A \cap B$. Then $c = c \cdot 1 = c(a + b) = ca + cb \in AB$, so
$A \cap B \subseteq AB$. Hence $A \cap B = AB$. □
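
For a concrete feel of Proposition 5.6.1, one can specialize to $R = \mathbb{Z}$, where every ideal is principal. The sketch below rests on the standard facts (assumed here, not proved in this section) that $(m) + (n) = (\gcd(m, n))$, $(m) \cap (n) = (\mathrm{lcm}(m, n))$ and $(m)(n) = (mn)$ for positive integers $m, n$; all function names are hypothetical.

```python
from math import gcd


def sum_generator(m, n):
    """Generator of (m) + (n) in Z, assuming m, n are positive integers."""
    return gcd(m, n)


def intersection_generator(m, n):
    """Generator of the intersection of (m) and (n), i.e. lcm(m, n)."""
    return m * n // gcd(m, n)


def product_generator(m, n):
    """Generator of the product ideal (m)(n) = (mn)."""
    return m * n


def are_coprime(m, n):
    """(m) + (n) = Z exactly when the sum is generated by 1."""
    return sum_generator(m, n) == 1


# (4) and (9) are coprime, so intersection and product coincide;
# (4) and (6) are not coprime, and the two ideals differ.
for m, n in [(4, 9), (4, 6)]:
    same = intersection_generator(m, n) == product_generator(m, n)
    print(m, n, are_coprime(m, n), same)
```

The pair $(4), (6)$ shows that the coprimality hypothesis cannot simply be dropped: there $A \cap B = (12)$ while $AB = (24)$.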

Example 5.6.1 Let $A$ and $B$ be two distinct maximal ideals of $R$. Then $A + B = R$.
[Hint. $A + B \supsetneq A$, since $A$ and $B$ are distinct maximal ideals. Hence
$A + B = R$, as $A$ is maximal.]

Theorem 5.6.1 (Chinese Remainder Theorem for rings) Let $R$ be a commutative
ring with 1.

(a) Let $A$ and $B$ be coprime ideals in $R$. Then the map $f : R \to R/A \times R/B$,
defined by $r \mapsto (r + A, r + B)$, is a ring epimorphism with
$\ker f = A \cap B = AB$, and the rings $R/AB$ and $R/A \times R/B$ are isomorphic.

(b) (General form) Let $A_1, A_2, \ldots, A_n$ be pairwise coprime ideals in $R$. Then the
map $f : R \to R/A_1 \times R/A_2 \times \cdots \times R/A_n$, defined by
$r \mapsto (r + A_1, r + A_2, \ldots, r + A_n)$, is a ring epimorphism with
$\ker f = A_1 \cap A_2 \cap \cdots \cap A_n = A_1 A_2 \cdots A_n$, and the rings
$R/(A_1 A_2 \cdots A_n)$ and $R/A_1 \times R/A_2 \times \cdots \times R/A_n$ are isomorphic.

(c) Let $n_1, n_2, \ldots, n_t$ be positive integers such that every pair of integers
$n_i, n_j$ are relatively prime for $i \neq j$. If $a_1, a_2, \ldots, a_t$ are $t$ integers,
then the system of congruences $x \equiv a_1 \pmod{n_1}$, $x \equiv a_2 \pmod{n_2}$,
$\ldots$, $x \equiv a_t \pmod{n_t}$ has a simultaneous solution, which is unique modulo
$n = n_1 n_2 \cdots n_t$.

Proof (a) $A$ and $B$ are coprime ideals in $R$ $\Rightarrow \exists\, a \in A$ and $b \in B$ such
that $1 = a + b$.

Clearly, $f$ is a ring homomorphism.

To show that $f$ is surjective, let $(x + A, y + B) \in R/A \times R/B$. Then

$$f(xb + ya) = (xb + ya + A,\ xb + ya + B) = (x + A,\ y + B), \text{ since } 1 = a + b.$$

Clearly, $\ker f = A \cap B$.

Hence $f$ is an epimorphism with $\ker f = A \cap B = AB$ $\Rightarrow$ the rings $R/AB$ and
$R/A \times R/B$ are isomorphic.

(b) Consider the map $f : R \to R/A_1 \times R/A_2 \times \cdots \times R/A_n$,
$r \mapsto (r + A_1, r + A_2, \ldots, r + A_n)$. Then $f$ is a ring epimorphism.

Hence the first part of (b) follows from (a) by induction.

The last part of (b) follows from the first part.

(c) It follows from the last part of (b) by taking $R = \mathbb{Z}$, $A_i = (n_i)$. □
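
The surjectivity argument in part (a) is constructive, and in the case $R = \mathbb{Z}$ of part (c) it can be turned into a small solver. The following is a minimal sketch, not the book's algorithm: it assumes positive, pairwise relatively prime moduli, obtains $a \in (n_1)$ and $b \in (n_2)$ with $a + b = 1$ from the extended Euclidean algorithm, and then uses the preimage $xb + ya$ from the proof. The function names are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def crt_pair(a1, n1, a2, n2):
    """Solve x = a1 (mod n1), x = a2 (mod n2) for relatively prime n1, n2."""
    g, s, t = extended_gcd(n1, n2)
    if g != 1:
        raise ValueError("moduli must be relatively prime")
    a = s * n1  # a lies in the ideal (n1)
    b = t * n2  # b lies in the ideal (n2), and a + b == 1
    # Same shape as f(xb + ya) in the proof of (a).
    return (a1 * b + a2 * a) % (n1 * n2), n1 * n2


def crt(residues, moduli):
    """Combine the congruences one at a time; returns (x, n) with 0 <= x < n."""
    x, n = 0, 1
    for a_i, n_i in zip(residues, moduli):
        x, n = crt_pair(x, n, a_i, n_i)
    return x, n


# x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7)
print(crt([2, 3, 2], [3, 5, 7]))  # (23, 105)
```

Folding the congruences in one at a time mirrors the induction used for part (b): after each step the accumulated modulus is coprime to the next $n_i$.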

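Corollary 1 Let $n_1, n_2, \ldots, n_t$ be integers $> 1$ such that every pair of integers
$n_i, n_j$ are relatively prime for $i \neq j$, $i, j \in \{1, 2, \ldots, t\}$. If
$n = n_1 n_2 \cdots n_t$, then the ring $\mathbb{Z}_n$ is isomorphic to the ring
$\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_t}$.

Corollary 2 If $n$ is a positive integer and $n = p_1^{n_1} p_2^{n_2} \cdots p_t^{n_t}$ is its
factorization into powers of distinct primes ($n_i \geq 1$), then

$$\mathbb{Z}_n \cong \mathbb{Z}_{p_1^{n_1}} \times \mathbb{Z}_{p_2^{n_2}} \times \cdots \times \mathbb{Z}_{p_t^{n_t}}.$$

Corollary 3 If $n$ is a positive integer and $n = p_1^{n_1} p_2^{n_2} \cdots p_t^{n_t}$ is its
factorization into powers of distinct primes ($n_i \geq 1$), then
$\mathbb{Z}_n^* \cong \mathbb{Z}_{p_1^{n_1}}^* \times \mathbb{Z}_{p_2^{n_2}}^* \times \cdots \times \mathbb{Z}_{p_t^{n_t}}^*$,
where $\mathbb{Z}_n^*$ denotes the multiplicative group of all units of the ring
$\mathbb{Z}_n$ (see Chap. 10).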
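
Corollaries 1 and 3 can be sanity-checked numerically for small $n$. The sketch below is an illustration only, with hypothetical function names: it verifies by exhaustion that $r \mapsto (r \bmod n_1, \ldots, r \bmod n_t)$ is a bijection from $\mathbb{Z}_n$ onto the product, and it counts units on both sides. It does not check that the map respects addition and multiplication, which is immediate from the definition.

```python
from itertools import product
from math import gcd


def residue_map(n, moduli):
    """The map r -> (r mod n_1, ..., r mod n_t) on representatives 0..n-1."""
    return {r: tuple(r % m for m in moduli) for r in range(n)}


def is_bijection_onto_product(n, moduli):
    """True when the residue map hits every tuple of the product exactly once."""
    images = set(residue_map(n, moduli).values())
    target = set(product(*(range(m) for m in moduli)))
    return images == target and len(images) == n


def units(n):
    """Representatives of the units of Z_n, i.e. residues coprime to n."""
    return [r for r in range(n) if gcd(r, n) == 1]


# 60 = 4 * 3 * 5 with pairwise relatively prime factors.
print(is_bijection_onto_product(60, [4, 3, 5]))
print(len(units(60)), len(units(4)) * len(units(3)) * len(units(5)))
# 8 = 2 * 4, but 2 and 4 are not relatively prime: the map is not a bijection.
print(is_bijection_onto_product(8, [2, 4]))
```

The last call shows why the hypothesis matters: $\mathbb{Z}_8$ is not isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_4$ via this map.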


5.7 Ideals of C(X)

Let $C(X)$ be the ring of real valued continuous functions on a compact topological
space $X$ having the identity element $c$ defined by the constant function $c(x) = 1$
for all $x \in X$. Then the maximal ideals in $C(X)$ correspond to the points of $X$ in a
natural way. This follows from Theorem 5.7.4.

Theorem 5.7.1 Let $C(X)$ be the ring of real valued continuous functions on a
compact topological space $X$.$^1$

(a) If $A$ is a proper ideal of $C(X)$, then there exists a point $x_0 \in X$ such that
$f(x_0) = 0$ for all $f \in A$.

(b) If $M$ is a maximal ideal of $C(X)$, then there exists a point $x_0 \in X$ such that
$M_{x_0} = \{f \in C(X) : f(x_0) = 0\} = M$.

Proof (a) Let $f \in A$. Since $f$ is continuous, $f^{-1}(0)$ is a closed set in $X$. Then
$f^{-1}(0) \neq \emptyset$ for each $f \in A$. Otherwise, if possible, there is some $f \in A$
such that $f^{-1}(0) = \emptyset$. Then $f(x) \neq 0$ for all $x \in X$. So, $1/f \in C(X)$ and
hence the unit element $c = f \cdot (1/f) \in A \subseteq C(X)$ implies that $A = C(X)$. This
contradicts the fact that $A$ is a proper ideal of $C(X)$. We now consider the family
$\mathcal{F} = \{f^{-1}(0) : f \in A\}$ of closed sets in $X$. We shall show that it has the
finite intersection property, i.e., every finite sub-family of $\mathcal{F}$ has a
non-empty intersection. If $f_1, f_2, \ldots, f_n \in A$, then each
$f_i^2 = f_i \cdot f_i \in A$ and hence $f = \sum_{i=1}^{n} f_i^2 \in A$. We claim that
$\bigcap_{i=1}^{n} f_i^{-1}(0) = f^{-1}(0)$. Let $x \in \bigcap_{i=1}^{n} f_i^{-1}(0)$. Then each
$f_i(x) = 0$ implies each $f_i^2(x) = 0$, resulting in $f(x) = 0$. Thus
$x \in f^{-1}(0)$. Hence $\bigcap_{i=1}^{n} f_i^{-1}(0) \subseteq f^{-1}(0)$. Conversely, let
$y \in f^{-1}(0)$. Then $f(y) = 0$ implies that each $f_i^2(y) = 0$ (a sum of
non-negative real numbers vanishes only if every term vanishes), i.e., each
$f_i(y) \cdot f_i(y) = 0$, resulting in each $f_i(y) = 0$, as $\mathbb{R}$ is an integral
domain. This implies $y \in f_i^{-1}(0)$ for each $i$, resulting in
$y \in \bigcap_{i=1}^{n} f_i^{-1}(0)$, which implies
$f^{-1}(0) \subseteq \bigcap_{i=1}^{n} f_i^{-1}(0)$. Thus
$\bigcap_{i=1}^{n} f_i^{-1}(0) = f^{-1}(0) \neq \emptyset$. Since $X$ is compact,
$\bigcap \{f^{-1}(0) : f \in A\} \neq \emptyset$. Hence there exists a point $x_0 \in X$ such
that $f(x_0) = 0$ for all $f \in A$.

(b) Let $M$ be a maximal ideal of $C(X)$. Then $M$ is a proper ideal of $C(X)$ and
hence by (a) there exists a point $x_0 \in X$ such that $f(x_0) = 0$ for all $f \in M$. Let
$M_{x_0} = \{f \in C(X) : f(x_0) = 0\}$. Since the identity element, which is the
constant function $c$, does not belong to $M_{x_0}$, $M_{x_0}$ is a proper ideal of
$C(X)$ such that $M \subseteq M_{x_0}$. As $M$ is a maximal ideal, it follows that
$M = M_{x_0}$. □

We now study the particular case when X = [0, 1]. 

Theorem 5.7.2 Let $R$ be the ring of all real valued continuous functions on $[0, 1]$
and $M_x = \{f \in R : f(x) = 0\}$, where $x \in [0, 1]$. Then $M_x$ is a maximal ideal
of $R$.


$^1$ A topological space $X$ is compact if and only if, for every collection of closed
sets in $X$ possessing the finite intersection property, the intersection of the
entire collection is non-empty (see [Chatterjee et al. (2003)]).

Proof Let $f, g \in M_x$ and $h \in R$. Then $f(x) = 0 = g(x)$. Now
$(f + g)(x) = f(x) + g(x) = 0$ and $(f \cdot h)(x) = f(x) \cdot h(x) = 0 = (h \cdot f)(x)$. Hence
$M_x$ is an ideal of $R$. Let $N$ be an ideal of $R$ such that $M_x \subsetneq N \subseteq R$.
Then $\exists$ a function $g \in N$ such that $g \notin M_x$. Hence $g(x) = r \neq 0$. Consider
a function $h \in R$ defined by $h(y) = g(y) - r$. Then $h$ is a continuous function
such that $h(x) = g(x) - r = r - r = 0 \Rightarrow h \in M_x \subseteq N$. Thus
$g, h \in N \Rightarrow g - h \in N$, since $N$ is an ideal of $R$ $\Rightarrow$ the constant function
$r \in N$. Again $r \neq 0 \Rightarrow r^{-1}$ exists $\Rightarrow r r^{-1} \in N \Rightarrow$ the constant
function $c(x) = 1 \in N \Rightarrow N = R$ (this is so because
$1 \in N \Rightarrow f \cdot 1 \in N\ \forall f \in R \Rightarrow R \subseteq N$. Again $N \subseteq R$. Hence
$N = R$). Consequently, $M_x$ is a maximal ideal of $R$. □

Corollary For every $x \in [0, 1]$, the ideal $M_x$ is also prime.

We now show that the maximal ideals in the ring R = C([0, 1]) correspond to 
the points in [0, 1]. 

Theorem 5.7.3 Let $\mathcal{M}$ be the set of all maximal ideals of the ring
$R = C([0, 1])$. Then there exists a bijective correspondence between the elements
of $\mathcal{M}$ and the points of $[0, 1]$.

Proof To prove the theorem, it is sufficient to show that the mapping
$\psi : [0, 1] \to \mathcal{M}$ defined by $\psi(x) = M_x = \{f \in R : f(x) = 0\}$ is a
bijection, where $\mathcal{M}$ is the set of all maximal ideals of the ring $R$. By
Theorem 5.7.2, $\psi$ is well defined. We claim that $\psi$ is injective. As $[0, 1]$ is
compact and Hausdorff, by the Urysohn Lemma$^2$ for each $y\ (\neq x)$ in $[0, 1]$,
there is a function $f$ in $R$ such that $f(x) = 0$ but $f(y) \neq 0$. This shows that
the mapping $\psi$ is injective. To show that the mapping $\psi$ is onto, let $M$ be a
maximal ideal in $R$. We claim that for this $M$, there is a point $y$ in $[0, 1]$ at
which every function in $M$ vanishes. If possible, suppose that for each point $x$
in $[0, 1]$ there exists a function $f_x$ in $M$ such that $f_x(x) \neq 0$. Since $f_x$ is
continuous, $x$ has a neighborhood at no point of which $f_x$ vanishes. Varying
$x$, we obtain an open cover of $[0, 1]$. As $[0, 1]$ is compact, this open cover has
a finite subcover and hence we obtain corresponding functions
$f_1, f_2, \ldots, f_n$ in $M$. Again as each $f_i$ is in $M$, each $f_i \cdot f_i = f_i^2 \in M$.
Then the function $g = \sum_{i=1}^{n} f_i^2 \in M$ is such that $g(x) > 0$ for all
$x \in [0, 1]$. Hence $g$ is invertible in $R$ and cannot lie in a proper ideal of $R$.
This contradicts the fact that $g$ lies in the proper ideal $M$. This contradiction
implies that there exists a point $x \in [0, 1]$ such that every function in $M$
vanishes at $x$. Hence $M \subseteq M_x$, and the maximality of $M$ gives $M = M_x$,
which shows that $\psi$ is onto. Consequently, $\psi$ is a bijection. □

Corollary 1 The maximal ideals in the ring R of all real valued continuous func- 
tions on [0, 1] correspond to the points in [0, 1]. 


$^2$ Urysohn Lemma Let $X$ be a normal space, and $A$ and $B$ be disjoint closed
subspaces of $X$. Then there exists a continuous real function $f$ defined on $X$,
all of whose values lie in $[0, 1]$, such that $f(A) = 0$ and $f(B) = 1$.
[Simmons 1963]

Corollary 2 Given a maximal ideal $M$ in $R$, there exists a real number
$x \in [0, 1]$ such that $M = M_x$.

Proof It follows from Theorem 5.7.3 (also from Theorem 5.7.1 when
$X = [0, 1]$). □

Proceeding as above, we can generalize Theorem 5.7.3 for an arbitrary compact 
topological space X . 

Theorem 5.7.4 Let $X$ be a compact topological space and $R = C(X)$ be the ring
of all real valued continuous functions on $X$. Then there exists a bijective
correspondence between the maximal ideals of $R$ and the points of $X$.

Proof Let $X$ be a compact topological space. Then for each $x \in X$, there is a
maximal ideal $M_x$ of $C(X)$ defined by $M_x = \{f \in C(X) : f(x) = 0\}$. Conversely,
given a maximal ideal $M$ of $C(X)$ there exists a point $x \in X$ at which every
function in $M$ vanishes, i.e., $M = M_x$. Hence there is a bijective correspondence
between the set of maximal ideals of $C(X)$ and the points of $X$. □

Corollary The maximal ideals in $C(X)$ correspond to the points of $X$. Moreover,
given a maximal ideal $M$ in $C(X)$ there exists a point $x \in X$ such that $M = M_x$.


Problems and Supplementary Examples (SE-I) 

1 (i) Let $R$ be a ring and

$$I_1 \subseteq I_2 \subseteq I_3 \subseteq \cdots \subseteq I_n \subseteq \cdots \qquad (5.2)$$

be an ascending chain of ideals of $R$. If $I = \bigcup_n I_n$, then $I$ is an ideal of $R$.

(ii) Deduce that in a principal ideal domain there cannot exist an infinite sequence
of ideals:

$$I_1 \subsetneq I_2 \subsetneq \cdots \text{ with } I_n \subsetneq I_{n+1} \text{ for all } n \in \mathbb{N}^+. \qquad (5.3)$$

For part (i), suppose $I = \bigcup_n I_n$. Then $I \neq \emptyset$. Let $a, b \in I$. Then $\exists$
integers $s, t$ such that $a \in I_s$, $b \in I_t$. Hence either $s \leq t$ or $t \leq s$. For
definiteness, let $s \leq t$. Then $a - b \in I_t$ and $ar, ra \in I_t\ \forall r \in R$
$\Rightarrow a - b \in I$, $ar, ra \in I$ $\Rightarrow I$ is an ideal of $R$.

For part (ii), suppose there exists an infinite sequence of ideals of type (5.3) in
a principal ideal domain $R$. Then $I = \bigcup_n I_n$ is an ideal of $R$ and hence
$I = (x)$ for some $x \in I$. Consequently, $x \in I_m$ for some $m \in \mathbb{N}^+$. This
shows that $I \subseteq I_m$. Thus $I_{m+1} \subseteq I \subseteq I_m$ $\Rightarrow$ a contradiction.

2 If $A$ and $B$ are non-null ideals of an integral domain $R$, then
$A \cap B \neq \{0\}$.

Solution Let $a \in A$ and $b \in B$ be two non-zero elements. Then $R$ is an integral
domain $\Rightarrow ab \neq 0$ and $A, B$ are ideals of $R$ $\Rightarrow ab \in A$ and $ab \in B$
$\Rightarrow ab \in A \cap B$. Consequently, $A \cap B \neq \{0\}$.

3 Let $A$ be an ideal of a ring $R$. Then the ring $R/A$ is commutative
$\Leftrightarrow xy - yx \in A\ \forall x, y \in R$. Deduce that if $B$ and $C$ are ideals of $R$ and
both $R/B$ and $R/C$ are commutative rings, then $R/(B \cap C)$ is also commutative.

Solution The quotient ring $R/A$ is commutative
$\Leftrightarrow (x + A)(y + A) = (y + A)(x + A)\ \forall x, y \in R$
$\Leftrightarrow xy + A = yx + A$ $\Leftrightarrow xy - yx \in A\ \forall x, y \in R$.

For the second part, the hypothesis implies by the first part that
$\forall x, y \in R$, $xy - yx \in B$ and also $xy - yx \in C$ and hence
$xy - yx \in B \cap C$ $\Rightarrow R/(B \cap C)$ is commutative by the first part.

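4 Let $A$ and $B$ be prime ideals in a commutative ring $R$. If $A \cap B$ is prime in
$R$, then $A \subseteq B$ or $B \subseteq A$.

Solution If $A \not\subseteq B$, we claim that $B \subseteq A$. $A \not\subseteq B \Rightarrow \exists\, a \in A$ such
that $a \notin B$. Let $b \in B$. Then $ab \in A \cap B \Rightarrow a \in A \cap B$ or
$b \in A \cap B$, since $A \cap B$ is prime. Since $a \notin B$, $b \in A \cap B \subseteq A$
$\Rightarrow B \subseteq A$.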

5 (a) Let $D$ be an integral domain and $a\ (\neq 0) \in D$ be not invertible. Then
$(a^{n+1}) \subsetneq (a^n)$, for all $n \in \mathbb{N}^+$.

Solution Clearly, $(a^{n+1}) \subseteq (a^n)$. If possible $(a^{n+1}) = (a^n)$, then
$a^n \in (a^{n+1}) \Rightarrow a^n = a^{n+1} r$ for some $r \in D$ $\Rightarrow 1 = ar$ by
Theorem 4.1.4 $\Rightarrow a$ is invertible in $D$ $\Rightarrow$ a contradiction.

(b) Deduce from (a) that

(i) there are infinitely many different ideals in an integral domain $D$ which is not a
field;

(ii) every finite integral domain is a field (cf. Theorem 4.1.6).

Solution (i) As $D$ is not a field, there is a non-zero non-invertible element $a$ in
$D$. Using (a), $(a), (a^2), (a^3), \ldots$ are all different ideals of $D$.

(ii) follows from (i).

6 Let $A$, $B$ and $C$ be ideals in a ring $R$. Then

(i) $A + B + C$ and $ABC$ are ideals;

(ii) $(A + B) + C = A + (B + C)$;

(iii) $(AB)C = ABC = A(BC)$;

(iv) $A(B + C) = AB + AC$ and $(A + B)C = AC + BC$.

[Hint. Use Definitions 5.1.4 and 5.1.5.]

7 Let $A = \{f(x) \in \mathbb{Z}[x] : f(0) \equiv 0 \pmod 2\}$. Then $A$ is an ideal of
$\mathbb{Z}[x]$ and $A = (2, x)$.

Worked-Out Exercises 

1. Let $R$ be a ring with 1 such that $R$ has no non-trivial left (right) ideals. Then
$R$ is a division ring.

Solution Let $x\ (\neq 0) \in R$. Then $I = Rx = \{rx : r \in R\}$ is a left ideal of $R$.
Now $x = 1 \cdot x \in I \Rightarrow I \neq \{0\} \Rightarrow I = R$ by hypothesis $\Rightarrow 1 = yx$ for
some $y \in R$ $\Rightarrow x$ has a left inverse in $R$ $\Rightarrow$ every non-zero element of $R$
has a left inverse in $R$ $\Rightarrow$ non-zero elements of $R$ form a multiplicative
group (see Proposition 2.3.2 of Chap. 2) $\Rightarrow R$ is a division ring.

A similar result holds if $R$ has no non-trivial right ideals.

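2. Let $R = M_2(\mathbb{Q})$ be the ring of $2 \times 2$ matrices over $\mathbb{Q}$ under usual
addition and multiplication. Then $R$ is a simple ring but not a division ring.

Solution Since $R$ contains divisors of zero, $R$ cannot be a division ring. On the
other hand, let $I \neq \{0\}$ be an ideal of $R$. Then there exists a non-zero matrix
$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in I$. As
$M \neq \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, without loss of generality, suppose
$a \neq 0$. Consider the matrices in $R$:

$$E_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
E_{21} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad
E_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

Now, $M \in I \Rightarrow E_{11} M E_{11} = \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} \in I$
$\Rightarrow E_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
= (a^{-1} I_2) \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} \in I$,
where $I_2$ is the identity matrix in $R$. Similarly, $E_{12}, E_{21}, E_{22} \in I$. Thus
any element $A = \begin{pmatrix} x & y \\ z & t \end{pmatrix} \in R$ can be expressed as
$A = xE_{11} + yE_{12} + zE_{21} + tE_{22}$. Hence $A \in I \Rightarrow R \subseteq I$. But
$I \subseteq R$.

Thus $I = R$ $\Rightarrow R$ has no non-trivial ideals $\Rightarrow R$ is a simple ring.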
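
The key step of this solution, recovering every matrix unit from a single non-zero element of the ideal, can be carried out mechanically. The sketch below uses exact rational arithmetic; it is an illustration with hypothetical helper names, indices are 0-based (so `unit(0, 0)` plays the role of $E_{11}$), and instead of assuming $a \neq 0$ it picks any non-zero entry $M_{pq}$ and uses $E_{ip} M E_{qj} = M_{pq} E_{ij}$.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def unit(i, j):
    """Matrix unit with 1 in position (i, j) and 0 elsewhere (0-based)."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(2)]
            for r in range(2)]


def unit_from(M, i, j):
    """Recover the matrix unit (i, j) from a non-zero M by two-sided products.

    Every factor applied to M lies in R = M_2(Q), so the result lies in any
    ideal of R that contains M.
    """
    p, q = next((p, q) for p in range(2) for q in range(2) if M[p][q] != 0)
    scaled = matmul(matmul(unit(i, p), M), unit(q, j))  # equals M[p][q] * E_ij
    inv = Fraction(1) / M[p][q]                         # scalar matrix inv * I_2
    return [[inv * entry for entry in row] for row in scaled]


M = [[Fraction(0), Fraction(3)], [Fraction(0), Fraction(0)]]
print(all(unit_from(M, i, j) == unit(i, j) for i in range(2) for j in range(2)))
```

Once all four matrix units lie in the ideal, every matrix does, which is the conclusion $I = R$ reached above.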

3. The ring $\mathbb{Z} \times \mathbb{Z}$ is not an integral domain but it has a quotient ring
which is an integral domain.

Solution $R = \mathbb{Z} \times \mathbb{Z}$ is a ring under component-wise addition and
multiplication: $(a, b) + (c, d) = (a + c, b + d)$, $(a, b) \cdot (c, d) = (ac, bd)$,
$\forall a, b, c, d \in \mathbb{Z}$. Then $(0, 1)$ and $(1, 0)$ are non-null elements of $R$
such that $(0, 1)(1, 0) = (0, 0)$ $\Rightarrow R$ contains divisors of zero $\Rightarrow R$ is not
an integral domain. Consider $I = \{(0, n) : n \in \mathbb{Z}\}$. Then $I$ is an ideal of
$R$ $\Rightarrow R/I$ is a quotient ring. Then the map $f : \mathbb{Z} \to R/I$ defined by
$f(n) = (n, 0) + I$, $\forall n \in \mathbb{Z}$ is a ring isomorphism. Hence $\mathbb{Z}$ is an
integral domain $\Rightarrow f(\mathbb{Z}) = R/I$ is an integral domain.

4. The rings $(\mathbb{Z}_n, +, \cdot)$ and $(\mathbb{Z}/(n), +, \cdot)$ are isomorphic.

Solution Define a map $\psi : \mathbb{Z}_n \to \mathbb{Z}/(n)$ by
$\psi((m)) = m + (n) = m + n\mathbb{Z}$. Then $\psi$ is well defined and a bijective
mapping such that $\psi((m + r)) = \psi((m)) + \psi((r))$ and
$\psi((mr)) = \psi((m))\psi((r))$, $\forall (m), (r) \in \mathbb{Z}_n$. Hence $\psi$ is a ring
isomorphism.

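5. Let $R$ be the ring
$R = \left\{ \begin{pmatrix} m & n \\ 0 & p \end{pmatrix} : m, n, p \in \mathbb{Z} \right\}$
and $I$ be an ideal given by
$I = \left\{ \begin{pmatrix} 0 & n \\ 0 & p \end{pmatrix} : n, p \in \mathbb{Z} \right\}$.
Then the quotient ring $R/I$ is isomorphic to $\mathbb{Z}$.

Solution $I$ is an ideal of $R$ $\Rightarrow R/I$ is a ring. Consider the mapping
$f : R \to \mathbb{Z}$ defined by
$f\left( \begin{pmatrix} m & n \\ 0 & p \end{pmatrix} \right) = m$. Then $f$ is a ring
epimorphism with

$$\ker f = \left\{ \begin{pmatrix} 0 & n \\ 0 & p \end{pmatrix} : n, p \in \mathbb{Z} \right\} = I \neq \{0\}.$$

Thus $f$ is not a monomorphism $\Rightarrow f$ is not an isomorphism. But by the First
Isomorphism Theorem for rings, $\exists$ an isomorphism $\tilde{f} : R/I \to \mathbb{Z}$
defined by $\tilde{f}(r + I) = f(r)$.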

6. Let $R$ be a commutative ring with 1 and $I$ be an ideal of $R$. If $I[x]$
denotes the ideal of $R[x]$ generated by $I$ (i.e., the polynomials all of whose
coefficients lie in $I$), then show that the rings $R[x]/I[x]$ and $(R/I)[x]$ are
isomorphic.

Solution Consider the natural map $\psi : R[x] \to (R/I)[x]$ given by reducing each
coefficient $r$ of a polynomial in $R[x]$ to the corresponding coefficient $r + I$ of
the polynomial in $(R/I)[x]$. Clearly this map $\psi$ is a ring epimorphism with
$\ker \psi = I[x]$. The result follows from the First Ring Isomorphism Theorem.

7. In the ring $\mathbb{Z}[x]$, $(x)$ is a prime ideal but not a maximal ideal.

Solution Consider the mapping $f : \mathbb{Z}[x] \to \mathbb{Z}$ defined by
$f(a_0 + a_1 x + \cdots + a_n x^n) = a_0$. Then $f$ is a ring epimorphism such that
$\ker f = \{a_0 + a_1 x + \cdots + a_n x^n \in \mathbb{Z}[x] : a_0 = 0\}
= \{a_1 x + \cdots + a_n x^n \in \mathbb{Z}[x]\} = (x)$. Hence by the Epimorphism
Theorem (First Isomorphism Theorem), the rings $\mathbb{Z}[x]/(x)$ and $\mathbb{Z}$ are
isomorphic. So, $\mathbb{Z}$ is an integral domain $\Rightarrow \mathbb{Z}[x]/(x)$ is an integral
domain $\Rightarrow (x)$ is a prime ideal in $\mathbb{Z}[x]$ by Theorem 5.3.1.

On the other hand, $\mathbb{Z}$ is not a field $\Rightarrow \mathbb{Z}[x]/(x)$ is not a field
$\Rightarrow (x)$ is not a maximal ideal in $\mathbb{Z}[x]$ by Theorem 5.3.2.

8. Any integral domain $D$ with only a finite number of ideals is a field.

Solution If possible, let $D$ be not a field. Then $\exists$ a non-zero non-invertible
element $a$ in $D$. Then $(a), (a^2), (a^3), \ldots, (a^n), \ldots$ are infinitely many
different ideals in $D$ (see Ex. 5(a), (b) of SE-I) $\Rightarrow$ a contradiction of the
hypothesis. Hence $D$ is a field.

9. Let $D$ be an integral domain and $a, b \in D^* = D - \{0\}$. Then $(a) = (b)$ iff
$ab^{-1} \in U(D) = \{x : x \text{ is a unit in } D\}$.

Solution Suppose $(a) = (b)$. Then $a = bc$ for some $c \in D$ and $b = ad$ for some
$d \in D$. Hence $b \neq 0$ and $b = bcd \Rightarrow 1 = cd \Rightarrow c \in U(D)$
$\Rightarrow ab^{-1} \in U(D)$.

Conversely, let $ab^{-1} \in U(D)$. Then $a = bc$ for some $c \in U(D)$. Let $x \in (a)$.
Then $x = ad$ for some $d \in D$ $\Rightarrow x = bcd \Rightarrow x \in (b) \Rightarrow (a) \subseteq (b)$.
Again $U(D)$ is a group and $ab^{-1} \in U(D) \Rightarrow ba^{-1} = (ab^{-1})^{-1} \in U(D)$
$\Rightarrow (b) \subseteq (a)$. Hence $(a) = (b)$.


5.8 Exercises 

Exercises-I 

1. Show that the ideal $(a)$ generated by a single element $a$ of an arbitrary ring
$R$ is given by

$$(a) = \Big\{ na + ra + as + \sum_{\text{finite}} r_i a s_i : r, s, r_i, s_i \in R;\ n \in \mathbb{Z} \Big\}.$$

If $R$ has an identity, show that

$$(a) = \Big\{ \sum_{\text{finite}} r_i a s_i : r_i, s_i \in R \Big\}.$$

If $R$ is a commutative ring with 1, show that $(1) = R$.

2. For any element $a$ belonging to a ring $(R, +, \cdot)$, show that

(i) $(aR, +, \cdot)$ is a right ideal of $(R, +, \cdot)$;

(ii) $(Ra, +, \cdot)$ is a left ideal of $(R, +, \cdot)$.

3. Let $R$ be a ring, $A \neq \emptyset$ a subset of $R$. An element $c \in R$ is said to be
a left (right) annihilator of $A$, iff $ca = 0$ ($ac = 0$) $\forall a \in A$.

Show that both annihilators of $R$ form ideals of $R$.

4. Prove that

(a) A ring $R$ with identity is a division ring iff $R$ has no non-trivial left ideals;

(b) If $D$ is a division ring, then $R = M_n(D)$ has no non-trivial left ideals.

[Hint. (a) For any $a\ (\neq 0) \in R$, $Ra$ is a left ideal of $R$. Now use the
Corollary of Proposition 2.3.2 of Chap. 2.]

5. (a) Let $R$ be an integral domain in which every proper ideal is prime. Show
that $R$ is a field.

(b) If $I$ and $K$ are non-null ideals of an integral domain, prove that
$I \cap K \neq \{0\}$.

6. (a) Let $I$ be an ideal of a commutative ring $R$. Then the radical of $I$, denoted
$\mathrm{rad}(I)$, is defined by
$\mathrm{rad}(I) = \{a \in R : a^n \in I \text{ for some integer } n > 0\}$.
Show that $\mathrm{rad}(I)$ is an ideal of $R$ such that $I \subseteq \mathrm{rad}(I)$.

(b) Prove that the ideal $I(A)$ of any subset $A$ of an affine $n$-space $K^n$ ($A$ is
not necessarily an algebraic set) is a radical ideal (see Chap. 5 or Ex. 7 of
Exercises-I).

[Hint. (a) Since the binomial theorem holds in a commutative ring, use this
theorem.

(b) For an integer $n \geq 1$, $f \in I(A) \Leftrightarrow f(x) = 0\ \forall (x) \in A
\Leftrightarrow (f(x))^n = 0\ \forall (x) \in A \Leftrightarrow f^n \in I(A)$. Hence
$f \in I(A) \Leftrightarrow f \in \mathrm{rad}(I(A))$.]

7. An ideal $I$ of a commutative ring is called a radical ideal iff
$I = \mathrm{rad}(I)$.

Let $K$ be an algebraically closed field and $K^n$ the cartesian product of $K$
with itself $n$ times and $A \subseteq K^n$. Define $I(A)$ by
$I(A) = \{f \in K[x_1, x_2, \ldots, x_n] : f(x) = 0\ \forall (x) \in A\}$. Then $I(A)$ is an
ideal of $K[x_1, x_2, \ldots, x_n]$ called the ideal of $A$. Show that $I(A)$ is a radical
ideal.

8. If $f : R \to S$ is a ring homomorphism, $I$ is an ideal of $R$ and $J$ is an ideal of
$S$ such that $f(I) \subseteq J$, show that $f$ induces a ring homomorphism
$\bar{f} : R/I \to S/J$, given by $a + I \mapsto f(a) + J$. Prove that $\bar{f}$ is an
isomorphism iff $f(R) + J = S$ and $f^{-1}(J) \subseteq I$. In particular, if $f$ is an
epimorphism such that $f(I) = J$ and $\ker f \subseteq I$, then show that $\bar{f}$ is an
isomorphism.

[Hint. See Proposition 2.6.1 of Chap. 2.]

9. Let $I$ and $J$ be ideals in a ring $R$. Prove that

(i) (Second Isomorphism Theorem). There is an isomorphism of rings
$I/(I \cap J) \cong (I + J)/J$;

(ii) (Third Isomorphism Theorem). If $I \subseteq J$, then $J/I$ is an ideal of $R/I$ and
there is an isomorphism of rings $(R/I)/(J/I) \cong R/J$.

[Hint. See Theorems 2.6.7 and 2.6.8 of Chap. 2.]

10. If $I$ is an ideal of a ring $R$, prove that there is a one-to-one correspondence
between the set of all ideals of $R$ containing $I$ and the set of all ideals of
$R/I$. Hence show that every ideal of $R/I$ is of the form $T/I$, where $T$ is an
ideal of $R$ containing $I$.

[Hint. Take $S = R/I$ and $f = \pi : R \to R/I$ in Theorem 5.2.4 or see the
Corollary or Remark of Theorem 5.2.4.]

11. (a) Let $R$ be a commutative ring with identity and $I$ be a prime ideal of $R$
with the property that $R/I$ is finite. Prove that $I$ is a maximal ideal of $R$.

[Hint. $I$ is prime $\Rightarrow R/I$ is an integral domain $\Rightarrow R/I$ is a field (as
$R/I$ is finite) $\Rightarrow I$ is maximal.]

(b) Let $R$ be a commutative ring with identity and $I$ be a proper ideal of $R$.
Show that $I$ is a maximal ideal $\Leftrightarrow (I \cup \{a\}) = R$ for every element
$a \notin I$.

(c) Show that an ideal $I$ in a commutative ring $R$ is prime iff for every pair of
ideals $A$ and $B$ of $R$, $AB \subseteq I \Rightarrow$ either $A \subseteq I$ or $B \subseteq I$.

12. Let $R$ and $S$ be commutative rings with identity. If $f : R \to S$ is a ring
homomorphism such that $f(R)$ is a field, show that $\ker f$ is a maximal ideal
of $R$.

[Hint. Clearly, $\ker f$ is an ideal of $R$. Let $I$ be an ideal of $R$ such that
$\ker f \subsetneq I \subseteq R$.

Then $\exists$ an element $a \in I$ but $a \notin \ker f$. Then $f(a) \neq 0$ and
$f(a) \in f(R)$ which is a field and hence $\exists\, x \in R$ such that
$f(a) f(x) = f(1) \Rightarrow ax - 1 \in \ker f \subseteq I$. Now
$ax \in I \Rightarrow 1 \in I \Rightarrow I = R$.]

13. Prove that a proper ideal $M$ of a commutative ring $R$ with 1 is maximal iff
for every element $r \notin M$, there exists some $a \in R$ such that $1 + ra \in M$.

[Hint. Let $M$ be a maximal ideal of $R$ and $r \notin M$. Then
$M \subsetneq (M, r) = R \Rightarrow 1 = m + rx$ for some $m \in M$ and $x \in R$.]

14. Let R be the ring of all real valued continuous functions defined on the closed 
interval [0,1]. Let I = {f e R : f(^) = 0}. Show that I is a maximal ideal 
of R. 

15. Let $\mathbb{Z}[x]$ be the ring of polynomials over the ring of integers $\mathbb{Z}$ and
$I$ be the principal ideal of $\mathbb{Z}[x]$ generated by $x$, i.e., $I = (x)$. Show that
$(x)$ is a prime ideal but not maximal.

16. Let $R$ be a commutative ring with 1. Show that there is a one-to-one
correspondence between the maximal ideals $M$ of $R$ and the maximal ideals $M'$
of $R[[x]]$ in such a way that $M'$ corresponds to $M$ iff $M'$ is generated by $M$
and $x$, i.e., $M' = (M, x)$.

[Hint. Let $M$ be a maximal ideal of $R$. By using Ex. 13 of Exercises-I, show
that $M' = (M, x)$ is a maximal ideal of $R[[x]]$. Next, let $M'$ be a maximal ideal
of $R[[x]]$. Define the set $M = \{a_0 \in R : \sum a_i x^i \in M'\}$, i.e., $M$ consists of
the constant terms of power series in $M'$. Show that $M$ is a maximal ideal of
$R$. To verify that the given correspondence is one-to-one, show that
$(M_1, x) = (M_2, x)$ for maximal ideals $M_1$ and $M_2$ of $R$ $\Rightarrow M_1 = M_2$.]

17. If $p$ is a prime and $n$ is an integer $> 1$, show that $\mathbb{Z}_{p^n}$ is a local ring
with unique maximal ideal $(p)$.

18. Let $R$ be a commutative ring with 1. Show that the following statements are
equivalent:

(i) $R$ is a local ring;

(ii) all non-units of $R$ are contained in some ideal $M \neq R$;

(iii) the non-units of $R$ form an ideal of $R$.

19. Let $R$ and $S$ be two commutative rings with identity. If $f : R \to S$ is a ring
epimorphism, show that

(i) $S$ is a field $\Leftrightarrow \ker f$ is a maximal ideal of $R$;

(ii) $S$ is an integral domain $\Leftrightarrow \ker f$ is a prime ideal of $R$.

[Hint. By the Note of Theorem 5.2.3, $R/\ker f \cong S$. Now for $I = \ker f$,
$R/I$ is a field $\Leftrightarrow \ker f$ is maximal and $R/I$ is an integral domain
$\Leftrightarrow \ker f$ is prime.]

20. (a) Let $R$ be a commutative ring with identity 1. Then the nil radical
of an ideal $I$ of $R$, denoted by $\sqrt{I}$, is the set
$\sqrt{I} = \{r \in R : r^n \in I \text{ for some positive integer } n \text{ (depending on } r)\}$.
Show that $\sqrt{I}$ is an ideal of $R$ and $I \subseteq \sqrt{I}$ (see Ex. 6(a) of
Exercises-I).

(b) A proper ideal $I$ of $R$ is said to be a semiprime ideal iff $I = \sqrt{I}$. Prove
that a proper ideal $I$ of $R$ is a semiprime ideal iff any one of the following
conditions holds:

(i) for any $a \in R$, $a^2 \in I$ implies $a \in I$;

(ii) the quotient ring $R/I$ has no non-zero nilpotent element.

(c) Let $R$ be a commutative ring with 1. A proper ideal $I$ of $R$ is said to be
a primary ideal iff for $a, b \in R$, $ab \in I$ and $b \notin I$ $\Rightarrow a^n \in I$ for some
positive integer $n$. Clearly every prime ideal is a primary ideal. Show that
$(4)$ is a primary ideal of $(\mathbb{Z}, +, \cdot)$; determine all its primary ideals. Prove
that a proper ideal $I$ of $R$ is a primary ideal, iff in the quotient ring $R/I$
every non-zero divisor of zero is a nilpotent element.

Further prove that if $I$ is a primary ideal of $R$, then $\sqrt{I}$ is a prime ideal
of $R$.

[Hint. Let $I$ be a primary ideal of $R$ and $a, b \in R$ be such that
$ab \in \sqrt{I}$. Then $(ab)^n = a^n b^n \in I$ for some integer $n \geq 1$. Then either
$a^n \in I$ or $(b^n)^m \in I$ for some integer $m \geq 1$. Hence either
$a \in \sqrt{I}$ or $b \in \sqrt{I}$.]

21. Let R be a commutative ring with 1. Show that 

(i) every prime ideal of R is a semiprime ideal but its converse is not true; 

(ii) every prime ideal of R is a primary ideal but its converse is not true; 

[Hint. Consider the ring $(\mathbb{Z}, +, \cdot)$. (6) is a semiprime ideal but not a
prime ideal and (4) is a primary ideal but not a prime ideal.]

22. Let R be a commutative ring with 1. Then the Jacobson radical of the ring R,
denoted by rad R, is defined by $\operatorname{rad} R = \bigcap \{M \mid M$ is a maximal ideal of $R\}$.
Clearly, $\operatorname{rad} R \neq \emptyset$, since R contains at least one maximal ideal by the
Corollary of Theorem 5.3.6 (Krull-Zorn).

If rad R = {0}, then R is called a semisimple ring. 

Show that (Z, +, •) is a semisimple ring. 

Show that $\operatorname{rad} R[[x]] = (\operatorname{rad} R, x)$. Hence show that if F is a field, then
$\operatorname{rad} F[[x]] = (x)$.

[Hint. For $R = \mathbb{Z}$, the maximal ideals of $\mathbb{Z}$ are precisely (p), where p is
a prime number. Then $\operatorname{rad} R = \operatorname{rad} \mathbb{Z} = \bigcap \{(p) : p$ is a prime number$\} = \{0\}$,
since no non-zero integer is divisible by every prime. For the next part,
$\operatorname{rad} R[[x]] = \bigcap \{M' : M'$ is a maximal ideal of $R[[x]]\} = \bigcap (M, x)$ (by Ex. 16 of
Exercises-I) $= (\operatorname{rad} R, x)$. In particular, if R is a field F, then $\operatorname{rad} F[[x]] = (x)$
(by the Corollary of Theorem 5.4.1).]

23. Let R be a commutative ring with 1. Prove the following: 

(i) Let A be an ideal of R. Then $A \subseteq \operatorname{rad} R$ $\Leftrightarrow$ each element of the coset
$1 + A$ has an inverse in R;

(ii) An element $a \in \operatorname{rad} R$ $\Leftrightarrow$ $1 - ra$ is invertible for each $r \in R$;

(iii) An element a is invertible in R $\Leftrightarrow$ the coset $a + \operatorname{rad} R$ is invertible in the
quotient ring $R/\operatorname{rad} R$;

(iv) 0 is the only idempotent in rad R.

24. Let I be a primary ideal in a commutative ring R with 1. Show that every zero
divisor in $R/I$ is nilpotent in $R/I$.

[Hint. Let $(a) \in R/I$ be a zero divisor. Then $\exists\, (b) \neq (0) \in R/I$ such that
$(a)(b) = (0)$. Hence $ab \in I \Rightarrow a^n \in I$ for some positive integer n, since $b \notin I$.
Hence $(a)^n = (0)$.]

25. Let I, J and K be ideals of a ring R satisfying (i) $J \subseteq K$, (ii) $I \cap J = I \cap K$
and (iii) $J/I = K/I$. Show that $J = K$.

[Hint. To show $K \subseteq J$, take an arbitrary element $k \in K$. Then $\exists$ an element
$j \in J$ such that $k + I = j + I$ (by (iii)) $\Rightarrow k - j = i$ for some $i \in I$. Again
$k - j \in K$ by (i). Consequently, $i = k - j \in I \cap K = I \cap J \Rightarrow k = i + j \in J
\Rightarrow K \subseteq J \Rightarrow J = K$ by (i).]

26. Let I and J be two ideals of a ring R. Show that the sets

(i) $I :_r J = \{a \in R : aJ \subseteq I\}$ (right quotient of I by J) and

(ii) $I :_l J = \{a \in R : Ja \subseteq I\}$ (left quotient of I by J) are ideals of R.

If R is a commutative ring, then we simply write I : J [see Theorem 7.1.7 
(Cohen) of Chap. 7]. 

Exercises A (True/False Statements) Determine the correct statements with justification.

1. Let R be a commutative ring with 1. If R has no non-trivial ideals, then R is a
field.

2. Let R be a commutative ring with 1. Then every prime ideal of R is a maximal
ideal.

3. Let $f : R \to S$ be a ring homomorphism. Then A is maximal in R $\Rightarrow$ $f(A)$ is
maximal in S.

4. Let $f : R \to S$ be a ring epimorphism. Then B is maximal in S $\Rightarrow$ $f^{-1}(B)$ is
maximal in R.

5. Let R and S be commutative rings with identity elements and $f : R \to S$ be a
ring epimorphism. Then S is an integral domain $\Leftrightarrow$ $\ker f$ is a prime ideal in R.

6. Let A be a two-sided maximal ideal of a ring R. Then R/A is a simple ring. 

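7. R is a simple ring with 1 $\Rightarrow$ R is a division ring.

8. The rings $\mathbb{Z}_n$ and $\mathbb{Z}/n\mathbb{Z}$, $\forall\, n \in \mathbb{N}^+$ ($n > 1$), are isomorphic.

9. $\mathbb{Z}/(n)$ is a quotient ring of $\mathbb{Z}/(mn)$, $\forall\, m, n \in \mathbb{N}^+$.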

10. Let R = C([0, 1]) be the ring of real valued continuous functions on [0, 1] and
$A = \{f \in R : f(\tfrac{1}{2}) = 0\}$. Then A is a maximal ideal in R.

11. There exist only a finite number of proper ideals in an integral domain which is
not a field. 

12. Any integral domain having only a finite number of proper ideals is a field. 

13. $(\mathbb{Z}, +, \cdot)$ is an ideal of $(\mathbb{Q}, +, \cdot)$.

14. A ring R has zero divisors $\Rightarrow$ every quotient ring of R has zero divisors.

15. Let R be a commutative ring with 1. Then R is a simple ring $\Rightarrow$ R is a field.

16. Let R be a commutative ring with 1. An element $a\ (\neq 0) \in R$ is invertible in
R $\Leftrightarrow$ $a \notin M$ for any maximal ideal M of R.

Exercises B Identify the correct alternative(s) (there may be more than one) from 
the following list: 

1. Let $R = (C([0, 1]), +, \cdot)$ be the ring of all real valued continuous functions on
[0, 1] and $M = \{f \in R : f(1/2) = 0\}$. Then

(a) M is not an ideal of R.

(b) M is an ideal of R but not a maximal ideal of R.

(c) M is a maximal ideal of R.

(d) M is an ideal of R but not a prime ideal of R.

2. Let (Z, +, •) be the ring of integers. Then Z has 

(a) Only one proper ideal. 

(b) Finitely many proper ideals. 

(c) Countably infinite many proper ideals. 

(d) Uncountably many proper ideals. 

3. Let R be a commutative ring with identity and F be a field. If $f : R \to F$ is a
(ring) epimorphism, then $K = \ker f = \{x \in R : f(x) = 0_F\}$ is

(a) A maximal ideal of R but not a prime ideal of R. 

(b) A prime ideal of R but not a maximal ideal of R. 

(c) Both a maximal and a prime ideal of R.

(d) Not a maximal ideal of R.

4. Let (Z[x], +, •) be the ring of polynomials over Z and I = (x). Then

(a) I is a prime ideal of Z[x] but not a maximal ideal. 

(b) I is a maximal ideal of Z[x] but not a prime ideal. 

(c) I is both a maximal and a prime ideal of Z[x].

(d) I is not a prime ideal of Z[x]. 

5. Let R be an arbitrary commutative ring with identity and I be a prime ideal of R 
such that R/I is finite. Then 

(a) I is a maximal ideal of R. 

(b) I = R. 

(c) R/I is an integral domain but not a field. 

(d) R/I is not an integral domain. 

6. Let $f : (\mathbb{Z}, +, \cdot) \to (\mathbb{Z}/(n), +, \cdot)$ be the ring homomorphism defined by $f(m) =
m + (n)$. Then

(a) f is a monomorphism but not an epimorphism.

(b) f is an epimorphism but not a monomorphism.

(c) f is an isomorphism.

(d) f is neither a monomorphism nor an epimorphism.

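7. Let R be a ring of $2 \times 2$ matrices whose entries are given by $m, n, p \in \mathbb{Z}$, and
let A be its subring whose entries are given by $n, p \in \mathbb{Z}$ (the two displayed matrix
sets are illegible in the source). Then the ring R and its subring A are such that

(a) A is an ideal of R such that the ring Z is isomorphic to the quotient ring
R/A.

(b) A is not an ideal of R.

(c) A is an ideal of R such that the ring Z is not isomorphic to the quotient ring
R/A.

(d) The quotient ring R/A is a field.

8. Let

$$M = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} : a, b, c \in \mathbb{Z} \right\}$$

be the ring of upper triangular matrices of order two over Z and let A be the subset
of M displayed in the source (illegible). Then

(a) A is not an ideal of M.

(b) A is an ideal of M such that M/A equals the set displayed in the source (illegible).

(c) A is an ideal of M such that M/A is isomorphic to Z.

(d) A is an ideal of M such that M/A is not an integral domain.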

9. Let Z[x] be the polynomial ring over Z. Then 

(a) Z is a subring of Z[x] but not an ideal of Z[x]. 

(b) Z is a left ideal of Z[x] but not a right ideal of Z[x]. 

(c) Z is a right ideal of Z[x] but not a left ideal of Z[x].

(d) Z is an ideal of Z[x]. 

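10. Let $\langle l(x) \rangle$ be the ideal generated by the polynomial $l(x)$ in $\mathbb{Q}[x]$ and $f(x) =
x^3 + x^2 + x + 1$ and $g(x) = x^3 - x^2 + x - 1$ be two polynomials in $\mathbb{Q}[x]$. Then

(a) $\langle f(x) \rangle + \langle g(x) \rangle = \langle x^3 + x \rangle$.

(b) $\langle f(x) \rangle + \langle g(x) \rangle = \langle x^4 - 1 \rangle$.

(c) $\langle f(x) \rangle + \langle g(x) \rangle = \langle x^2 + 1 \rangle$.

(d) $\langle f(x) \rangle + \langle g(x) \rangle = \langle f(x) \cdot g(x) \rangle$.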

11. Which of the following statement(s) is (are) true? 

(a) The set of all 2 x 2 matrices with rational entries (with the usual operations of 
matrix addition and matrix multiplication) is a ring which has no non-trivial 
ideals. 

(b) Let R = C([0, 1]) be considered as a ring with the usual operations of 
pointwise addition and pointwise multiplication. Let $I = \{f : [0, 1] \to \mathbb{R} \mid
f(1/2) = 0\}$. Then I is a maximal ideal.

(c) R[x] is a field. 

(d) Let R be a commutative ring and let P be prime ideal of R. Then R/P is an 
integral domain. 

12. Consider the ring $\mathbb{Z}_n$ for $n \geq 2$. If

(a) $\mathbb{Z}_n$ is a field, then n is a composite integer.

(b) $\mathbb{Z}_n$ is a field iff n is a prime integer.

(c) $\mathbb{Z}_n$ is an integral domain, then n is a prime integer.

(d) There is an injective ring homomorphism of $\mathbb{Z}_5$ to $\mathbb{Z}_n$, then n is prime.

13. Let C([0, 1]) be the ring of continuous real-valued functions on [0, 1], with
addition and multiplication defined pointwise. For any subset S of [0, 1] let
$K(S) = \{f \in C([0, 1]) \mid f(x) = 0$ for all $x \in S\}$. If

(a) K(S) is an ideal in C([0, 1]), then S is closed in [0, 1]. 

(b) S has only one point, then K (S) is a prime ideal but not a maximal ideal. 

(c) K(S) is a maximal ideal, then S has only one point. 

(d) S has only one point, then K(S) is a maximal ideal.

14. Let C([0, 1]) denote the ring of all continuous real-valued functions on [0, 1]
with respect to pointwise addition and pointwise multiplication. Then

(a) C([0, 1]) contains divisors of zero.

(b) For an element $a \in [0, 1]$, the set $K = \{f \in C([0, 1]) \mid f(a) = 0\}$ is an ideal
in C([0, 1]).

(c) C([0, 1]) is an integral domain.

(d) For any proper ideal M in C([0, 1]), there exists at least one point $a \in [0, 1]$
such that $f(a) = 0$ for all $f \in M$.

15. Let R be a (commutative) ring (with identity). Let A and B be ideals in R. 
Then 

(a) $A \cup B$ is an ideal in R.

(b) $A \cap B$ is an ideal in R.

(c) $AB = \{xy : x \in A, y \in B\}$ is an ideal in R.

(d) $A + B = \{x + y : x \in A, y \in B\}$ is an ideal in R.


5.9 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Artin 1991;
Atiyah and Macdonald 1969; Birkhoff and Mac Lane 1965; Burtan 1968; Chatterjee
et al. 2003; Dugundji 1989; Fraleigh 1982; Fulton 1969; Herstein 1964; Hungerford
1974; Jacobson 1974, 1980; Lang 1965; McCoy 1964; Simmons 1963; van der
Waerden 1970; Zariski and Samuel 1958, 1960) for further details.


Chapter 6

Factorization in Integral Domains 
and in Polynomial Rings 


In Chap. 1 we have already discussed divisibility and factorization of integers. It is 
natural to ask whether an abstract ring can have such factorization. Relatively few 
such rings exist. In this chapter we are able to extend the concepts of divisibility, 
gcd, lcm, division algorithm, and Fundamental Theorem of Arithmetic for integers
to different classes of integral domains. The main aim of this chapter is to study 
the problem of factoring the elements of an integral domain as products of irre- 
ducible elements. We also generalize the concept of division algorithm for integers 
to arbitrary integral domains by introducing the concept of Euclidean domains. We 
also study the polynomial rings over a certain class of important rings and prove 
Eisenstein’s irreducibility criterion, the Gauss Lemma and related topics. Our study 
culminates in proving the Gauss Theorem, which provides an extensive class of 
uniquely factorizable domains. 


6.1 Divisibility 

Ideals of integral domains play an important role in the study of factorization theory. 

Throughout this chapter R denotes an integral domain unless otherwise stated
and $R'$ denotes $R \setminus \{0\}$.

The concept of divisibility has been developed with an eye on generalizing a similar 
concept already discussed for integers. This concept introduces the notions of divisors,
prime and irreducible elements, associated elements, lcm and gcd properties,
and unique factorization domains (UFD). They are studied in this section with the 
help of the theory of ideals. For example, prime and irreducible elements in an inte- 
gral domain are characterized by prime and maximal ideals, respectively. Moreover, 
the Fundamental Theorem of Arithmetic is generalized for a PID (Theorems 6.1.5 
and 6.1.6). 

Definition 6.1.1 Let R be an integral domain and let $R' = R \setminus \{0\}$. A non-zero
element $a \in R$ is said to divide an element $b \in R$ (denoted by $a \mid b$) in R iff there
exists an element $c \in R$ such that $b = ac$. The elements a and $b \in R'$ are said to be
associates (or associated elements) iff $a \mid b$ and $b \mid a$. If $a \mid b$, we sometimes say that b
is divisible by a in R or a is a divisor of b in R.

Example 6.1.1 In $(\mathbb{Z}, +, \cdot)$, $3 \mid -3$ and $-3 \mid 3$ $\Rightarrow$ 3 and $-3$ are associates.

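Theorem 6.1.1 Let R be an integral domain and $a, b, u \in R'$. Then

(i) $a \mid a$;

(ii) $a \mid b$ and $b \mid c$ $\Rightarrow$ $a \mid c$ ($c \in R$);

(iii) $a \mid b$ and $a \mid c$ $\Rightarrow$ $a \mid (bx + cy)$ for every $x, y \in R$;

(iv) u is an invertible element in R $\Rightarrow$ $u \mid d$ for every $d \in R$;

(v) u is an invertible element in R $\Leftrightarrow$ $u \mid 1$;

(vi) $a \mid c$ ($c \in R$) $\Leftrightarrow$ $(c) \subseteq (a)$ (i.e., $Rc \subseteq Ra$);

(vii) a and b are associates $\Leftrightarrow$ $a = bv$ for some invertible element $v \in R$.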

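Proof (i) $a = a1 \Rightarrow a \mid a$ (since $1 \in R$).

(ii) $a \mid b$ and $b \mid c$ $\Rightarrow$ $\exists\, x, y \in R$ such that $b = ax$ and $c = by$ $\Rightarrow$ $c = axy$ $\Rightarrow$ $a \mid c$
(since $xy \in R$).

(iii) Trivial.

(iv) $d = d1 = d(u^{-1}u)$ (as u is invertible in R) $= u(du^{-1}) = uc$ (where $c =
du^{-1} \in R$) $\Rightarrow$ $u \mid d$ for every $d \in R$.

(v) $1 = uu^{-1} = uc$ (where $c = u^{-1} \in R$) $\Rightarrow$ $u \mid 1$. Again $u \mid 1$ $\Rightarrow$ $1 = uc$ for some
$c \in R$ $\Rightarrow$ the inverse of u is $c \in R$. So, u is invertible in R.

(vi) $a \mid c$ $\Rightarrow$ $c = ra$ (for some $r \in R$) $\in Ra$ $\Rightarrow$ $Rc \subseteq Ra$ $\Rightarrow$ $(c) \subseteq (a)$. Again,
$(c) \subseteq (a)$ $\Rightarrow$ $c \in (a)$ $\Rightarrow$ $c \in Ra$ $\Rightarrow$ $c = ra$ for some $r \in R$ $\Rightarrow$ $a \mid c$.

(vii) a and b are associates $\Rightarrow$ $a \mid b$ and $b \mid a$ $\Rightarrow$ $b = ac$ and $a = bd$ for some $c, d \in
R$ $\Rightarrow$ $a = acd$ $\Rightarrow$ $1 = cd$ (by the cancellation law in R) $\Rightarrow$ c and d are both invertible
$\Rightarrow$ $a = bv$, where $v = d$. Again $a = bv$ (where v is invertible in R) $\Rightarrow$ $b \mid a$. Again v
is invertible in R $\Rightarrow$ $v^{-1} \in R$ $\Rightarrow$ $av^{-1} = bvv^{-1}$ $\Rightarrow$ $b = av^{-1}$ $\Rightarrow$ $a \mid b$. □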

Remark 1 a and b are associates in R $\Leftrightarrow$ $(a) = (b)$.

Remark 2 u is a unit in R $\Leftrightarrow$ $(u) = R$.

Remark 3 For an arbitrary commutative ring with 1, the above results (i)-(vi) remain
valid, but the result (vii) need not hold.

Problem 1 Find the invertible elements of the Gaussian ring $\mathbb{Z}[i]$.

Solution Clearly, $\mathbb{Z}[i]$ is a commutative ring with identity having no zero divisor.
So, $\mathbb{Z}[i]$ is an integral domain. Again $u = a + bi$ is an invertible element of $\mathbb{Z}[i]$ $\Rightarrow$
$(a + bi) \mid 1$ $\Rightarrow$ $1 = (a + bi)(c + di)$ for some $c + di \in \mathbb{Z}[i]$ $\Rightarrow$ $1 = (a - bi)(c - di)$
(taking complex conjugates) $\Rightarrow$ $1 = (a^2 + b^2)(c^2 + d^2)$ $\Rightarrow$ $a^2 + b^2 = 1$ $\Rightarrow$ either
$a = \pm 1$ and $b = 0$ or $a = 0$ and $b = \pm 1$ $\Rightarrow$ $1, -1, i$, and $-i$ are the only invertible
elements of $\mathbb{Z}[i]$.
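
The norm argument of Problem 1 can be cross-checked by a small brute-force search. The sketch below is only an illustration: it models $a + bi$ as the integer pair `(a, b)`, and the helper names `norm`, `mul` and `units_in_box` are assumptions of this example, not notation from the text.

```python
# Gaussian integers a + bi are modelled as pairs (a, b); this encoding is an
# assumption of the sketch.

def norm(z):
    a, b = z
    return a * a + b * b

def mul(z, w):
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def units_in_box(bound):
    """All (a, b) with |a|, |b| <= bound having a multiplicative inverse in the box."""
    box = [(a, b) for a in range(-bound, bound + 1)
                  for b in range(-bound, bound + 1)]
    return sorted(z for z in box if any(mul(z, w) == (1, 0) for w in box))

units = units_in_box(3)
print(units)                               # the four pairs for -1, -i, i, 1
print(all(norm(z) == 1 for z in units))    # every unit found has norm 1
```

Since a unit must have norm 1, enlarging the search box cannot produce further units.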

Definition 6.1.2 Let R be a commutative ring with 1. An element $q \in R$ is said to
be irreducible in R iff

(i) q is a non-zero non-unit;

(ii) whenever $q = bc$ for $b, c \in R$, then b or c is a unit.

An element $p \in R$ is said to be prime in R iff

(i) p is a non-zero non-unit;

(ii) whenever $p \mid ab$ for $a, b \in R$, then $p \mid a$ or $p \mid b$.

Example 6.1.2 (i) In the integral domain $(\mathbb{Z}, +, \cdot)$, 6 is not an irreducible element,
but 7 is an irreducible element. If p is a prime integer, then both p and $-p$ are
irreducible elements and prime (according to Definition 6.1.2).

(ii) In $(\mathbb{Z}_6, +, \cdot)$ (which is not an integral domain), (2) is prime but not irreducible,
since $(2) = (2) \cdot (4)$ and neither (2) nor (4) is a unit in $\mathbb{Z}_6$.
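
Part (ii) of the example can be verified by exhaustive search, since $\mathbb{Z}_6$ has only six elements. The following is a minimal sketch that applies Definition 6.1.2 literally; the function names are assumptions made for the illustration.

```python
n = 6
elements = range(n)

def divides(p, x):
    """True iff p | x in Z_n, i.e. x = p*c (mod n) for some c."""
    return any((p * c) % n == x for c in elements)

def is_unit(x):
    return divides(x, 1)

def is_prime_element(p):
    """Non-zero non-unit p such that p | ab implies p | a or p | b."""
    if p == 0 or is_unit(p):
        return False
    return all(divides(p, a) or divides(p, b)
               for a in elements for b in elements
               if divides(p, (a * b) % n))

def is_irreducible(q):
    """Non-zero non-unit q such that q = bc forces b or c to be a unit."""
    if q == 0 or is_unit(q):
        return False
    return all(is_unit(b) or is_unit(c)
               for b in elements for c in elements
               if (b * c) % n == q)

print([x for x in elements if is_unit(x)])    # units of Z_6
print(is_prime_element(2), is_irreducible(2))
```

The factorization $2 = 2 \cdot 4$ in $\mathbb{Z}_6$ is exactly what makes `is_irreducible(2)` fail.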

In a principal ideal domain R, prime elements and irreducible elements are characterized
by the corresponding prime ideals and maximal ideals of R, respectively.

Theorem 6.1.2 Let p be a non-zero non-unit element in an integral domain R.
Then

(i) p is prime iff (p) is a non-zero prime ideal of R;

(ii) p is irreducible iff (p) is maximal in the set S of all proper principal ideals
of R;

(iii) every prime element of R is irreducible;

(iv) if R is a PID, then p is prime iff p is irreducible;

(v) every associate of an irreducible (prime) element of R is irreducible (prime);

(vi) the only divisors of an irreducible element of R are its associates and the units
of R.

Proof (i) As p is a non-zero non-unit element of R, (p) is non-zero and not R. Let
p be prime in R. Then for $b, c \in R$, if $bc \in (p)$, then $\exists$ an element $r \in R$ such that
$bc = rp$ $\Rightarrow$ $p \mid bc$ $\Rightarrow$ $p \mid b$ or $p \mid c$ $\Rightarrow$ $b \in (p)$ or $c \in (p)$ $\Rightarrow$ (p) is a prime ideal of R,
such that (p) is non-zero.

Conversely, let (p) be a non-zero prime ideal of R. Then $p \mid bc$ (where $b, c \in R$)
$\Rightarrow$ $bc \in (p)$ $\Rightarrow$ $b \in (p)$ or $c \in (p)$ $\Rightarrow$ $p \mid b$ or $p \mid c$ $\Rightarrow$ p is prime in R.

(ii) Let p be irreducible in R and (a) be a principal ideal of R such that
$(p) \subsetneq (a) \subseteq R$. Now $p \in (a)$ $\Rightarrow$ $p = ra$ for some $r \in R$ $\Rightarrow$ either r or a is invertible,
since p is irreducible. Suppose r is invertible, then $p = ra$ $\Rightarrow$ $a = r^{-1}p \in (p)$ $\Rightarrow$
$(a) \subseteq (p)$ $\Rightarrow$ a contradiction $\Rightarrow$ a is invertible $\Rightarrow$ $(a) = R$ $\Rightarrow$ (p) is a maximal
principal ideal of R. Conversely, let (p) be a maximal principal ideal of R. If p is
not irreducible, then $p = bc$ for some $b, c \in R$, where neither b nor c is invertible
(the possibility that p is a unit $\Rightarrow$ $(p) = R$ is not tenable). Now $b \in (p)$ $\Rightarrow$ $b = rp$
for some $r \in R$ $\Rightarrow$ $p = bc = (rp)c = (rc)p$ $\Rightarrow$ $1 = rc$ $\Rightarrow$ c is invertible in R $\Rightarrow$
a contradiction $\Rightarrow$ $b \notin (p)$ $\Rightarrow$ $(p) \subsetneq (b)$. Again $(b) = R$ $\Rightarrow$ b is a unit $\Rightarrow$ a contradiction
$\Rightarrow$ $(b) \neq R$. Thus $(p) \subsetneq (b) \subsetneq R$ $\Rightarrow$ a contradiction as (p) is a maximal
principal ideal of R. So, p must be an irreducible element of R.

(iii) Let p be a prime element of R. Then $p = ab$ for some $a, b \in R$ $\Rightarrow$ $p \mid ab$ $\Rightarrow$ $p \mid a$ or
$p \mid b$. Suppose $p \mid a$. Then $px = a$ (for some $x \in R$) $\Rightarrow$ $p = p(xb)$ $\Rightarrow$ $1 = xb$ $\Rightarrow$ b is
a unit $\Rightarrow$ p is irreducible.

(iv) The $\Rightarrow$ (implication) part follows from (iii).

To prove the $\Leftarrow$ part, suppose p is an irreducible element such that $p \mid ab$ ($a, b \in R$). Then
$ab = pc$ for some $c \in R$. As R is a PID, $(p, a) = (d)$ for some $d \in R$ $\Rightarrow$ $p = rd$
for some $r \in R$ $\Rightarrow$ either r or d is an invertible element. Now d is invertible $\Rightarrow$
$(p, a) = R$ $\Rightarrow$ $1 = tp + ua$ for some $t, u \in R$ $\Rightarrow$ $b = b1 = btp + bua = btp + upc =
p(tb + uc)$ $\Rightarrow$ $p \mid b$. For the other possibility, r is invertible in R $\Rightarrow$ $d = r^{-1}p \in
(p)$ $\Rightarrow$ $(d) \subseteq (p)$ $\Rightarrow$ $a \in (p)$ $\Rightarrow$ $p \mid a$. Thus $p \mid ab$ $\Rightarrow$ $p \mid a$ or $p \mid b$ $\Rightarrow$ p is prime.

(v) If p is irreducible and d is an associate of p, then $p = du$, where u is an
invertible element in R (by Theorem 6.1.1 (vii)). Now $d = ab$ $\Rightarrow$ $p = abu$ $\Rightarrow$ a is a
unit or bu is a unit. But bu is a unit $\Rightarrow$ b is a unit. Consequently, d is irreducible.

(vi) p is irreducible and $a \mid p$ $\Rightarrow$ $(p) \subseteq (a)$ $\Rightarrow$ $(p) = (a)$ or $(a) = R$ by (ii) $\Rightarrow$ a is
either an associate of p or a unit. □

Remark The converse of (iii) is not true in an arbitrary integral domain (see Ex. 15 
of Exercises-I). 

We now introduce the concept of a greatest common divisor (gcd). 

Definition 6.1.3 Let $a_1, a_2, \ldots, a_n$ be elements (not all zero) of an integral domain
R. An element $d \in R$ is said to be a greatest common divisor of $a_1, a_2, \ldots, a_n$ iff

(i) $d \mid a_i$ for $i = 1, 2, \ldots, n$ (d is a common divisor);

(ii) $c \mid a_i$ (for $i = 1, 2, \ldots, n$) $\Rightarrow$ $c \mid d$.

Suppose d and $x \in R$ satisfy conditions (i) and (ii). Then $d \mid x$ and $x \mid d$ $\Rightarrow$ d and
x are associates $\Rightarrow$ d is unique (if it exists) up to arbitrary invertible factors. So we
denote any greatest common divisor of $a_1, a_2, \ldots, a_n$ by $\gcd(a_1, a_2, \ldots, a_n)$.

Remark Given $a_1, a_2, \ldots, a_n \in R'$, their gcd may not exist. For example, in $R =
\mathbb{Z}[\sqrt{5}\,i]$, the elements $2(1 + \sqrt{5}\,i)$ and 6 have no gcd.
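
The non-existence claimed in the Remark can be confirmed by a finite search. The sketch below assumes the encoding of $a + b\sqrt{5}\,i$ as the integer pair `(a, b)` and uses the norm $N(a + b\sqrt{5}\,i) = a^2 + 5b^2$; neither the encoding nor the function names come from the text. Any divisor $d$ of 6 satisfies $N(d) \mid 36$, so it suffices to search $|a| \le 6$, $|b| \le 2$.

```python
# a + b*sqrt(-5) is modelled as the pair (a, b); an assumption of this sketch.

def mul(z, w):
    a, b = z
    c, d = w
    return (a * c - 5 * b * d, a * d + b * c)

def norm(z):
    a, b = z
    return a * a + 5 * b * b

def divides(x, y):
    """True iff x | y, i.e. y * conj(x) / N(x) has integer coordinates."""
    n = norm(x)
    if n == 0:
        return y == (0, 0)
    a, b = mul(y, (x[0], -x[1]))
    return a % n == 0 and b % n == 0

u = (6, 0)   # the element 6
v = (2, 2)   # the element 2(1 + sqrt(-5))

box = [(a, b) for a in range(-6, 7) for b in range(-2, 3) if (a, b) != (0, 0)]
common = sorted(d for d in box if divides(d, u) and divides(d, v))
# A gcd would be a common divisor that every common divisor divides.
gcds = [d for d in common if all(divides(c, d) for c in common)]

print(common)
print(gcds)
```

The common divisors found are $\pm 1$, $\pm 2$ and $\pm(1 + \sqrt{5}\,i)$; since neither 2 nor $1 + \sqrt{5}\,i$ divides the other, no common divisor is divisible by all of them.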

Theorem 6.1.3 Let $a_1, a_2, \ldots, a_n$ be non-zero elements of an integral domain R.
Then $a_1, a_2, \ldots, a_n$ have a greatest common divisor d, expressible in the form
$d = r_1 a_1 + r_2 a_2 + \cdots + r_n a_n$ ($r_i \in R$), iff $(a_1, a_2, \ldots, a_n)$ is a principal ideal.

Proof Suppose $d = \gcd(a_1, a_2, \ldots, a_n)$ exists and is written in the form

$$d = \sum_{i=1}^{n} r_i a_i .$$

Then $d \in (a_1, a_2, \ldots, a_n) = A$ (say) $\Rightarrow$ $(d) \subseteq A$. Now $d \mid a_i$ (for each i) $\Rightarrow$ $a_i = x_i d$
($x_i \in R$) $\Rightarrow$ each element of A belongs to (d) $\Rightarrow$ $A \subseteq (d)$. Consequently, $A = (d)$.

Conversely, let $(a_1, a_2, \ldots, a_n) = A$ (say) be a principal ideal of R. Then $A = (d)$
for some $d \in R$. So each $a_i \in (d)$ $\Rightarrow$ $d \mid a_i$ for each i. We now show that for any
common divisor c of the $a_i$'s, $c \mid d$. Now $a_i = b_i c$ for suitable $b_i \in R$ and $d = \sum_{i=1}^{n} r_i a_i$,
$r_i \in R$ $\Rightarrow$ $d = \left(\sum_{i=1}^{n} r_i b_i\right) c$ $\Rightarrow$ $c \mid d$ $\Rightarrow$ d is a $\gcd(a_1, a_2, \ldots, a_n)$, expressed in the
desired form. □

Corollary Any finite set of non-zero elements $a_1, a_2, \ldots, a_n$ of a principal ideal domain
R has a greatest common divisor of the form $\gcd(a_1, a_2, \ldots, a_n) = \sum_{i=1}^{n} r_i a_i$
for suitable choices of $r_i \in R$.

Definition 6.1.4 If $\gcd(a_1, a_2, \ldots, a_n) = 1$ ($\in R$), we say that $a_1, a_2, \ldots, a_n$ are
relatively prime.

Corollary (Bezout's Identity) If $a_1, a_2, \ldots, a_n$ are non-zero elements of a principal
ideal domain R, then $a_1, a_2, \ldots, a_n$ are relatively prime iff $\sum_{i=1}^{n} r_i a_i = 1$ for some
$r_i \in R$.
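
For the principal ideal domain $\mathbb{Z}$ and $n = 2$, the coefficients in Bezout's identity can be computed by the extended Euclidean algorithm. The following is a minimal sketch for this special case only; the function name `extended_gcd` is an assumption of the example.

```python
def extended_gcd(a, b):
    """Return (d, r, s) with d = gcd(a, b) >= 0 and d == r*a + s*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    # Invariant: old_r == old_s*a + old_t*b and r == s*a + t*b.
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:   # normalize the gcd by the unit -1
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

d, r, s = extended_gcd(35, 12)
print(d, r, s, r * 35 + s * 12)   # d is 1, so 35 and 12 are relatively prime
```

The returned pair `(r, s)` is not unique: adding a multiple of `b/d` to `r` and subtracting the matching multiple of `a/d` from `s` gives another valid pair.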

We now study the concept of a least common multiple (lcm) which is the dual to
that of gcd.

Definition 6.1.5 Let $a_1, a_2, \ldots, a_n$ be non-zero elements of an integral domain R.
An element $d \in R$ is said to be a least common multiple (lcm) of $a_1, a_2, \ldots, a_n$ iff

(i) $a_i \mid d$ for $i = 1, 2, \ldots, n$ (d is a common multiple);

(ii) $a_i \mid c$ (for $i = 1, 2, \ldots, n$) implies $d \mid c$.

Remark An lcm (if it exists) is unique up to associates but an lcm may not exist. For
example, an lcm of $2(1 + \sqrt{5}\,i)$ and 6 does not exist in $\mathbb{Z}[\sqrt{5}\,i]$.

We write $\operatorname{lcm}(a_1, a_2, \ldots, a_n)$ to denote any lcm of these elements (if it exists).

Theorem 6.1.4 Let $a_1, a_2, \ldots, a_n$ be non-zero elements of an integral domain R.
Then $a_1, a_2, \ldots, a_n$ have a least common multiple iff the ideal $\bigcap (a_i)$ is a principal
ideal in R.

Proof Suppose $d = \operatorname{lcm}(a_1, a_2, \ldots, a_n)$ exists. Then $d \in (a_i)$ for each i $\Rightarrow$ $(d) \subseteq
\bigcap (a_i)$. Again $r \in \bigcap (a_i)$ $\Rightarrow$ $a_i \mid r$ for each i $\Rightarrow$ $d \mid r$ $\Rightarrow$ $r \in (d)$ $\Rightarrow$ $\bigcap (a_i) \subseteq (d)$. So,
$\bigcap (a_i) = (d)$.

Conversely, $\bigcap (a_i)$ is a principal ideal of R $\Rightarrow$ $\bigcap (a_i) = (d)$ (for some $d \in R$) $\Rightarrow$
$(d) \subseteq (a_i)$ for each i $\Rightarrow$ $a_i \mid d$ for each i. Again for any other common multiple c of
$a_1, a_2, \ldots, a_n$ in R, $a_i \mid c$ for each i $\Rightarrow$ $(c) \subseteq (a_i)$ for each i $\Rightarrow$ $(c) \subseteq \bigcap (a_i) = (d)$ $\Rightarrow$
$d \mid c$ $\Rightarrow$ $d = \operatorname{lcm}(a_1, a_2, \ldots, a_n)$. □
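
In $\mathbb{Z}$ the theorem says $(a) \cap (b) = (\operatorname{lcm}(a, b))$. A quick numerical check on a bounded window of integers, as a sketch only (the window size `bound` and the helper `multiples` are assumptions of the example):

```python
from math import gcd

def multiples(a, bound):
    """Elements of the ideal (a) of Z lying in [-bound, bound]."""
    return {k for k in range(-bound, bound + 1) if k % a == 0}

a, b, bound = 12, 18, 200
common = multiples(a, bound) & multiples(b, bound)   # (a) ∩ (b), truncated
d = a * b // gcd(a, b)                               # lcm(12, 18)

print(d, common == multiples(d, bound))
```

The comparison is only over a finite window, so it illustrates the statement rather than proving it.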

Definition 6.1.6 A ring R is said to have the gcd property (lcm property), iff any
finite number of non-zero elements of R admits a gcd (lcm).

Remark Every principal ideal domain satisfies both the gcd property and the lcm property.
This follows from Theorems 6.1.3 and 6.1.4.

Definition 6.1.7 Let R be an integral domain and a be a non-zero non-invertible
element of R. Then a is said to have a factorization in R iff a can be expressed as a
finite product of irreducible elements of R. If every non-zero non-invertible element
of R has a factorization in R then R is said to be a factorization domain.

Definition 6.1.8 An integral domain R is a unique factorization domain iff the 
following two conditions hold: 

(i) Every non-zero non-invertible element of R can be factored into a finite product 
of irreducible elements; 

(ii) if $a = p_1 p_2 \cdots p_n$ and $a = q_1 q_2 \cdots q_m$ ($p_i$, $q_i$ are irreducible), then $n = m$ and
for some permutation $\sigma$ of $\{1, 2, \ldots, n\}$, $p_i$ and $q_{\sigma(i)}$ are associates for each i,
in R.

Example 6.1.3 Consider $(\mathbb{Z}, +, \cdot)$. By the Fundamental Theorem of Arithmetic
(Theorem 1.4.8 of Chap. 1), every non-zero non-unit element in Z is a product of a finite
number of irreducible elements (prime integers or their negatives) and this factor- 
ization is unique (except for the order of the irreducible factors). Consequently, 
(Z, +, •) is a unique factorization domain. 
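
For concreteness, a factorization in $\mathbb{Z}$ can be produced by trial division. The sketch below returns a unit together with positive primes, which is one way (an assumption of this example) of normalizing the "up to associates" freedom in Definition 6.1.8.

```python
def factor(n):
    """Return (unit, primes) with n == unit * product(primes), unit in {1, -1}."""
    if n == 0:
        raise ValueError("0 has no factorization into irreducibles")
    unit = 1 if n > 0 else -1
    m = abs(n)
    primes = []
    p = 2
    while p * p <= m:
        while m % p == 0:      # divide out each prime factor completely
            primes.append(p)
            m //= p
        p += 1
    if m > 1:                  # the remaining cofactor is prime
        primes.append(m)
    return unit, primes

print(factor(-60))   # unit -1 with primes 2, 2, 3, 5
print(factor(84))
```

Replacing any prime `p` in the list by `-p` and adjusting the unit gives another factorization of the same element, differing only by associates.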

Theorem 6.1.5 Let R be a PID. Then every non-zero non-invertible element has a 
factorization into a finite product of primes. 

Proof Let a be a non-zero non-invertible element of R. Then $\exists$ a prime $p_1 \in R$
such that $p_1 \mid a$ (see Ex. 4 of Exercises-I) $\Rightarrow$ $a = p_1 a_1$ for some non-zero $a_1 \in R$ $\Rightarrow$
$(a) \subseteq (a_1)$. Suppose $(a) = (a_1)$. Now $(a) = (a_1)$ $\Rightarrow$ $a_1 = ra$ for some $r \in R$ $\Rightarrow$ $a =
p_1 a_1 = p_1 r a$ $\Rightarrow$ $1 = p_1 r$ $\Rightarrow$ $p_1$ is invertible $\Rightarrow$ a contradiction $\Rightarrow$ $(a) \neq (a_1)$ $\Rightarrow$
$(a) \subsetneq (a_1)$. Thus we obtain an increasing chain of principal ideals:

$$(a) \subsetneq (a_1) \subsetneq (a_2) \subsetneq \cdots \subsetneq (a_n) \subsetneq \cdots \qquad (6.1)$$

with $a_{n-1} = p_n a_n$ for some prime $p_n \in R$. This process is continued as long as $a_n$ is
not an invertible element of R. But this chain terminates (see Ex. 2 of Exercises-I)
and for some n, $a_n$ must have an inverse and then $(a_n) = R$. Thus (6.1) gives $a =
p_1 p_2 \cdots p_{n-1} p'_n$, where $p'_n = p_n a_n$ is prime (as it is an associate of a prime). □

Theorem 6.1.6 Every principal ideal domain R is a unique factorization domain. 

Proof Let a be a non-zero non-invertible element of R. Then a has a prime
factorization by Theorem 6.1.5. To prove the unique factorization, suppose $a =
p_1 p_2 \cdots p_n = q_1 q_2 \cdots q_m$ ($n \leq m$), where $p_i$ and $q_i$ are primes in R. Then $p_1 \mid
(q_1 q_2 \cdots q_m)$. Hence $p_1$ divides some $q_j$ ($j = 1, 2, \ldots, m$). Without loss of generality,
suppose $p_1 \mid q_1$ and $n < m$. Then $q_1 = p_1 u_1$ for some unit $u_1 \in R$. Canceling $p_1$,
we find that $p_2 \cdots p_n = u_1 q_2 \cdots q_m$. Continuing this process we find (after n steps)
that

$$1 = u_1 u_2 \cdots u_n q_{n+1} \cdots q_m .$$

As the $q_i$ are not invertible, we get a contradiction, resulting in $m = n$. Thus every $p_i$
has some $q_j$ as an associate and conversely. Hence the two prime factorizations are
identical (up to order and associates of factors). □

Remark The converse of Theorem 6.1.6 is not true. Consider $\mathbb{Z}[x]$. It is a unique
factorization domain but not a principal ideal domain (see Ex. 3 of SE-I).


6.2 Euclidean Domains 

We now generalize Division Algorithm for integers to arbitrary integral domains by 
defining Euclidean domains. The ring of integers, Gaussian rings and polynomial 
rings are the main sources for the study of Euclidean domains with the help of norm 
functions. 

Definition 6.2.1 An integral domain R is said to be a Euclidean domain iff there
exists a function $\delta : R \setminus \{0\} \to \mathbb{N}$ (non-negative integers) such that

(i) $\delta(ab) \geq \delta(a)$ (and also $\geq \delta(b)$) for any $a, b \in R$ with $a \neq 0$ and $b \neq 0$;

(ii) (Division algorithm) for any $a, b \in R$ with $b \neq 0$, there exist $q, r \in R$ (quotient
and remainder) such that $a = qb + r$, where either $r = 0$ or $\delta(r) < \delta(b)$.
$\delta$ is called the Euclidean valuation or Euclidean norm function or simply norm
function. We also call the pair $(R, \delta)$ a Euclidean domain.

We do not assign a value to $\delta(0)$.

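Example 6.2.1 (i) Every field F is a Euclidean domain with Euclidean norm function
defined by

$$\delta(a) = 1 \quad \forall\, a\ (\neq 0) \in F$$

and division algorithm is defined by $a = (ab^{-1})b + 0$ $\forall\, a \in F$ and $\forall\, b\ (\neq 0) \in F$.

(ii) $(\mathbb{Z}, +, \cdot)$ is a Euclidean domain with Euclidean norm function $\delta$ defined by
$\delta(a) = |a|$, $a\ (\neq 0) \in \mathbb{Z}$. Now $\delta(ab) = |ab| = |a||b| = \delta(a)\delta(b) \geq \delta(a)$, since $\delta(b) \geq 1$.

Let $a, b \in \mathbb{Z}$ and $b \neq 0$. Then $\exists$ integers q and r (by Theorem 1.4.3 of Chap. 1)
such that $a = bq + r$, where either $r = 0$ or $\delta(r) < \delta(b)$. So the division algorithm holds.

(iii) The ring $D = \mathbb{Z}[i]$ of Gaussian integers is a Euclidean domain with Euclidean
norm function $\delta$ defined by $\delta(m + in) = m^2 + n^2$ $\forall\, m, n$ (not both zero) $\in \mathbb{Z}$.
To define the division algorithm let $\alpha, \beta \in D$ and $\beta \neq 0$. Then $\alpha/\beta = a + ib$, where
a, b are rational numbers. Thus there exist integers $m_0, n_0$ such that $|m_0 - a| \leq \tfrac{1}{2}$
and $|n_0 - b| \leq \tfrac{1}{2}$.
Hence $a + ib = (m_0 + i n_0) + \varepsilon_1 + i\varepsilon_2$, where $|\varepsilon_1| \leq \tfrac{1}{2}$ and $|\varepsilon_2| \leq \tfrac{1}{2}$ $\Rightarrow$ $\alpha = (m_0 +
i n_0)\beta + (\varepsilon_1 + i\varepsilon_2)\beta$. Now $\alpha, \beta \in D$ and $m_0 + i n_0 \in D$ $\Rightarrow$ $\alpha - (m_0 + i n_0)\beta \in D$ $\Rightarrow$
$(\varepsilon_1 + i\varepsilon_2)\beta \in D$ $\Rightarrow$ $\alpha = q\beta + r$, where $r = (\varepsilon_1 + i\varepsilon_2)\beta$ and $q = m_0 + i n_0$. Now
$\delta(r) = \delta((\varepsilon_1 + i\varepsilon_2)\beta) = (\varepsilon_1^2 + \varepsilon_2^2)|\beta|^2 \leq (\tfrac{1}{4} + \tfrac{1}{4})|\beta|^2 = \tfrac{1}{2}\delta(\beta) < \delta(\beta)$ $\Rightarrow$ $\delta : D \setminus \{0\} \to \mathbb{N}$
is a Euclidean function.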
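
The rounding construction in part (iii) translates directly into a procedure. The following sketch assumes Gaussian integers are stored as integer pairs `(a, b)` for $a + ib$; the function name `gauss_divmod` is hypothetical. Python's `round` breaks ties toward the even integer, which still keeps each rounding error within $\tfrac{1}{2}$.

```python
from fractions import Fraction

def norm(z):
    a, b = z
    return a * a + b * b

def gauss_divmod(alpha, beta):
    """Return (q, r) with alpha = q*beta + r and norm(r) <= norm(beta)/2."""
    a, b = alpha
    c, d = beta
    n = norm(beta)
    if n == 0:
        raise ZeroDivisionError("division by zero in Z[i]")
    # alpha / beta = alpha * conj(beta) / N(beta), computed exactly in Q(i).
    x = Fraction(a * c + b * d, n)
    y = Fraction(b * c - a * d, n)
    m0, n0 = round(x), round(y)          # nearest integers to the coordinates
    q = (m0, n0)
    r = (a - (m0 * c - n0 * d), b - (m0 * d + n0 * c))   # alpha - q*beta
    return q, r

q, r = gauss_divmod((27, 23), (8, 1))
print(q, r, norm(r), norm((8, 1)))
```

Here $27 + 23i = (4 + 2i)(8 + i) + (-3 + 3i)$ with $\delta(-3 + 3i) = 18 < 65 = \delta(8 + i)$.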

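Problem 2 Let $(R, \delta)$ be a Euclidean domain. Prove that

(i) for each non-zero $a \in R$, $\delta(a) \geq \delta(1)$;

(ii) if two non-zero elements $a, b \in R$ are associates, then $\delta(a) = \delta(b)$;

(iii) an element $a\ (\neq 0) \in R$ is invertible iff $\delta(a) = \delta(1)$.

Solution (i) $a = a1$ $\Rightarrow$ $\delta(a) \geq \delta(1)$;

(ii) a and b are associates $\Rightarrow$ $a = bu$, for some invertible element $u \in R$ $\Rightarrow$ $b =
au^{-1}$ $\Rightarrow$ $\delta(a) = \delta(bu) \geq \delta(b)$ and $\delta(b) = \delta(au^{-1}) \geq \delta(a)$ $\Rightarrow$ $\delta(a) = \delta(b)$;

(iii) $a\ (\neq 0)$ has an inverse in R $\Rightarrow$ $ab = 1$, for some $b \in R$ $\Rightarrow$ $\delta(a) \leq \delta(ab) =
\delta(1) \leq \delta(a)$ $\Rightarrow$ $\delta(a) = \delta(1)$. Conversely, for the element $a\ (\neq 0) \in R$, $\delta(a) = \delta(1)$ $\Rightarrow$ $\exists\, q,
r \in R$ such that $1 = qa + r$, where $r = 0$ or $\delta(r) < \delta(a)$ $\Rightarrow$ $r = 0$ or $\delta(r) < \delta(1)$ $\Rightarrow$
$r = 0$ (since the alternative contradicts (i)) $\Rightarrow$ $1 = qa$ $\Rightarrow$ a is invertible in R.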

Problem 3 Prove that the quotient and remainder in condition (ii) of Definition 6.2.1
are unique iff $\delta(a + b) \leq \max\{\delta(a), \delta(b)\}$ for any $a, b \in R$.

Solution Suppose there exist non-zero elements $a, b \in R$ such that $\delta(a + b) >
\max\{\delta(a), \delta(b)\}$. Then $b = 0(a + b) + b = 1(a + b) - a$ with both possibilities
$\delta(-a) = \delta(a) < \delta(a + b)$ and $\delta(b) < \delta(a + b)$ shows non-uniqueness of quotient
and remainder in condition (ii) of Definition 6.2.1. Conversely, suppose the given condition
holds and the element $a \in R$ has two representations,

$$a = qb + r \quad (r = 0 \text{ or } \delta(r) < \delta(b)), \qquad a = q'b + r' \quad (r' = 0 \text{ or } \delta(r') < \delta(b)). \qquad (A)$$

If $r \neq r'$ and $q \neq q'$, then $\delta(b) \leq \delta((q' - q)b) = \delta(r - r') \leq \max\{\delta(r), \delta(-r')\} <
\delta(b)$, a contradiction. This shows that $r = r'$ and $q = q'$ by (A), as each of these
relations implies the other $\Rightarrow$ uniqueness of the representation.

Remark The division algorithm for $\mathbb{Z}$ holds by defining $\delta(x) = |x|$, for all $x \in \mathbb{Z} \setminus \{0\}$.

Theorem 6.2.1 Every Euclidean domain is a principal ideal domain. 

Proof Let $(R, \delta)$ be a Euclidean domain and A an ideal of R such that $A \neq \{0\}$.
Consider the set S defined by $S = \{\delta(a) : a \in A;\ a \neq 0\} \subseteq \mathbb{N}$. Then $S \neq \emptyset$ and S
has a least element by the well-ordering principle. Then $\exists$ an element $b \in A$ such
that $\delta(b)$ is least in S. We claim that $A = (b)$. For any $a \in A$, $\exists\, q, r \in R$ such that
$a = qb + r$, where $r = 0$ or $\delta(r) < \delta(b)$. Now $r = a - qb \in A$ (since A is an ideal), so
$r \neq 0$ would give $\delta(r) \in S$ with $\delta(r) < \delta(b)$, contradicting the minimality
of $\delta(b)$. Hence $r = 0$ $\Rightarrow$ $a = qb \in (b)$ $\Rightarrow$ $A \subseteq (b)$. Again $b \in A$ $\Rightarrow$ $(b) \subseteq A$. Consequently, $A = (b)$. □

Remark 1 Every ideal of a Euclidean ring R is of the form bR, where $b \in R$.

Remark 2 The converse of Theorem 6.2.1 is not true. For example, $R = \{a +
b(1 + \sqrt{19}\,i)/2 : a, b \in \mathbb{Z}\}$ forms a principal ideal domain under usual addition and
multiplication. But R is not a Euclidean domain.

Corollary Every Euclidean domain is a unique factorization domain. 

Proof It follows from Theorems 6.2.1 and 6.1.6. □ 

Theorem 6.2.2 If F is a field, then F[x] is a Euclidean domain.

Proof If F is a field, then F[x] is an integral domain by Proposition 4.5.1. For any
non-zero $f(x) \in F[x]$, define $\delta : F[x] \setminus \{0\} \to \mathbb{N}$ (non-negative integers) by

$$\delta(f) = \text{degree of } f.$$

Then $\delta(f)$ is a non-negative integer such that $\deg(fg) = \deg f + \deg g$ by Proposition 4.5.2
and hence

$$\delta(fg) \geq \delta(f) \ (\text{and also} \geq \delta(g)), \quad \forall\, f(x), g(x) \in F[x] \setminus \{0\}.$$

To verify the division algorithm, let $f(x), g(x) \in F[x]$ with $g(x) \neq 0$. If $f = 0$ or $\deg f <
\deg g$, then $f = 0g + f$, with $f = 0$ or $\delta(f) < \deg g$. Next assume that $\deg f \geq \deg g$. We
apply induction on $\deg f = n$. If $n = 0$, then $m = \deg g = 0$ and hence f is a constant
multiple of g, with remainder 0. We assume that the division algorithm holds $\forall\, f, g \in F[x]$ with
$\deg f < n$ and $\deg f \geq \deg g$. Let

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n, \quad a_n \neq 0 \quad \text{and}$$

$$g(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m, \quad b_m \neq 0$$

be polynomials in F[x] such that $\deg f = n \geq m = \deg g$.

Then $h(x) = f(x) - a_n b_m^{-1} x^{n-m} g(x)$ is a polynomial in F[x] in which the coefficient
of $x^n$ is $a_n - a_n b_m^{-1} b_m = 0$. Hence $h = 0$ or $\deg h \leq n - 1$ $\Rightarrow$ $h(x) = q(x)g(x) + r(x)$,
with $r = 0$ or $\delta(r) < \delta(g)$ (by the first case if $h = 0$ or $\deg h < \deg g$, and by the induction
hypothesis otherwise) $\Rightarrow$ $f(x) = a_n b_m^{-1} x^{n-m} g(x) +
q(x)g(x) + r(x) = q_1(x)g(x) + r(x)$, where $q_1(x) = a_n b_m^{-1} x^{n-m} + q(x) \in F[x]$
with $r = 0$ or $\delta(r) < \delta(g)$.

Hence F[x] is a Euclidean domain. □
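
The inductive step of the proof, subtracting $a_n b_m^{-1} x^{n-m} g(x)$ to kill the leading term, is exactly polynomial long division. A minimal sketch over $F = \mathbb{Q}$, assuming polynomials are stored as coefficient lists with the constant term first and the zero polynomial as the empty list (these conventions and the name `poly_divmod` are assumptions of the example):

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients so that the last entry is the leading one."""
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r == [] or deg r < deg g, over Q."""
    r = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)          # n - m
        coeff = r[-1] / g[-1]            # a_n * b_m^{-1}
        q[shift] = coeff
        for i, c in enumerate(g):        # r <- r - coeff * x^shift * g
            r[i + shift] -= coeff * c
        trim(r)                          # the leading term has cancelled
    return q, r

q, r = poly_divmod([1, 0, 1], [1, 2])    # divide x^2 + 1 by 2x + 1
print(q, r)
```

For this input the quotient is $\tfrac{1}{2}x - \tfrac{1}{4}$ and the remainder is $\tfrac{5}{4}$. Exact rational arithmetic via `Fraction` matters here: with floating-point coefficients the leading term might not cancel exactly.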

Corollary 1 If F is a field, then F[x] is a principal ideal domain.

Proof It follows from Theorems 6.2.1 and 6.2.2. □

Corollary 2 If F is a field, then F[x] is a unique factorization domain.

Proof It follows from Theorem 6.2.2 and Corollary of Theorem 6.2.1. □


6.3 Factorization of Polynomials over a UFD 

As the polynomial ring over an integral domain is also an integral domain, we can extend the factorization theory to such polynomial rings. In this section we study polynomials over unique factorization domains (UFDs). We first give a sufficient condition for irreducibility of polynomials over a UFD, called the Eisenstein criterion. One of the most important applications of the Eisenstein criterion is in the proof of the irreducibility of the cyclotomic polynomial $\phi_p(x)$ in $\mathbf{Q}[x]$ (see Ex. 4, SE-I). The property that $\mathbf{Z}[x]$ is a UFD is generalized to R[x], where R is an arbitrary UFD (see Theorem 6.3.2). This theorem is essentially due to Gauss and provides an extensive class of UFDs.

We start with the concept of primitive polynomials. 

Definition 6.3.1 Let R be a UFD and $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in R[x]$. Then the content of f(x), denoted C(f), is defined by $C(f) = \gcd(a_0, a_1, \ldots, a_n)$.

For example, for $f(x) = 4x^3 + 2x^2 + 3x + 7 \in \mathbf{Z}[x]$, C(f) = 1.

Definition 6.3.2 Let R be a UFD. Then f(x) is said to be a primitive polynomial in R[x] iff C(f) = 1.

Proposition 6.3.1 Let R be a UFD. The product of two primitive polynomials in R[x] is also a primitive polynomial.

Proof Left as an exercise. □ 
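As a numerical illustration of Definition 6.3.1 and Proposition 6.3.1 for $R = \mathbf{Z}$, the sketch below computes contents and checks one product of two primitive polynomials. The helper names `content` and `poly_mul`, the coefficient-list representation (constant term first) and the second polynomial g are assumptions made for this example; a single check of course proves nothing in general.

```python
from functools import reduce
from math import gcd


def content(coeffs):
    """Content C(f) of f in Z[x]: the (non-negative) gcd of its coefficients."""
    return reduce(gcd, coeffs, 0)


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b
    return prod


f = [7, 3, 2, 4]   # 4x^3 + 2x^2 + 3x + 7, the example from the text
g = [3, 0, 5]      # 5x^2 + 3, a second primitive polynomial chosen for illustration

print(content(f), content(g))     # both polynomials are primitive
print(content(poly_mul(f, g)))    # so is their product, as Proposition 6.3.1 asserts
print(content([6, 10, 4]))        # 4x^2 + 10x + 6 is not primitive
```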

The concepts of irreducibility and units in any commutative ring R with 1 are closely related. In particular, it is necessary to look into the units of a polynomial ring R[x] for the study of irreducibility of polynomials in R[x]. When R is an integral domain, the units in R[x] are precisely the constant polynomials which are units in R. We now study the irreducibility of polynomials over a UFD.

Proposition 6.3.2

(i) If c is an irreducible element of a UFD R, then the constant polynomial c is also irreducible in R[x].

(ii) Every polynomial of degree one over a UFD R whose leading coefficient is a unit in R is irreducible in R[x]. In particular, every polynomial of degree one over a field is irreducible.

(iii) Let f(x) be a polynomial of degree 2 or 3 over a field F. Then f(x) is irreducible over F iff f(x) has no root in F.

Proof Left as an exercise. □
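Part (iii) of Proposition 6.3.2 gives a finite test when F is the field $\mathbf{Z}_p$: evaluate the polynomial at each of the p elements. A minimal sketch follows, assuming coefficient lists with the constant term first; the function names are illustrative. The test is valid only in degrees 2 and 3, since a polynomial of degree 4 or more can be reducible without having a root.

```python
def has_root_mod_p(coeffs, p):
    """True if the polynomial (coefficients lowest degree first) has a root in Z_p."""
    def value(a):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation modulo p
            acc = (acc * a + c) % p
        return acc
    return any(value(a) == 0 for a in range(p))


def is_irreducible_deg_2_or_3(coeffs, p):
    """Irreducibility over Z_p via the root test; only valid for degree 2 or 3."""
    degree = len(coeffs) - 1
    if degree not in (2, 3) or coeffs[-1] % p == 0:
        raise ValueError("expects a polynomial of degree 2 or 3 over Z_p")
    return not has_root_mod_p(coeffs, p)


print(is_irreducible_deg_2_or_3([1, 1, 1], 2))     # x^2 + x + 1 over Z_2
print(is_irreducible_deg_2_or_3([1, 1, 0, 1], 3))  # x^3 + x + 1 over Z_3 (1 is a root)
print(is_irreducible_deg_2_or_3([1, 0, 1], 7))     # x^2 + 1 over Z_7
```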

Lemma 6.3.1 (Gauss Lemma) Let R be a UFD and F = Q(R), the quotient field of R. If $f(x) \in R[x]$ is a non-constant irreducible polynomial in R[x], then f(x) is also irreducible in F[x].

Proof Let f(x) be a non-constant irreducible polynomial in R[x]. If possible, let f(x) be reducible in F[x]. Then $\exists\, u(x), v(x) \in F[x]$ such that $f(x) = u(x)v(x)$, where $0 < \deg u < \deg f$ and $0 < \deg v < \deg f$. We may take $u(x) = (a/b)u_1(x)$ and $v(x) = (c/d)v_1(x)$, where $a, b\ (\ne 0), c, d\ (\ne 0) \in R$ and $u_1(x), v_1(x) \in R[x]$ are both primitive polynomials. Hence $f(x) = (ac/bd)u_1(x)v_1(x)$, where $u_1(x)v_1(x)$ is a primitive polynomial by Proposition 6.3.1. Now $f(x) \in R[x]$ is irreducible and C(f) divides f $\Rightarrow$ C(f) = 1 $\Rightarrow$ f is primitive. Consequently, $(bd)f(x) = (ac)u_1(x)v_1(x)$ $\Rightarrow$ $bd = ac$ up to a unit of R (comparing the contents). Hence, after absorbing this unit into $u_1$, $f(x) = u_1(x)v_1(x)$ in R[x], where $\deg u_1 = \deg u < \deg f$ and $\deg v_1 = \deg v < \deg f$. But this contradicts the assumption that f(x) is irreducible in R[x]. This leads to the conclusion that f(x) is also irreducible in F[x]. □

Theorem 6.3.1 (The Eisenstein criterion) Let R be a UFD with quotient field Q(R) and $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in R[x]$, $a_n \ne 0$, $n \ge 1$, i.e., f(x) be a non-constant polynomial in R[x] of degree n. If there exists an irreducible element $p \in R$ such that

(i) $p \mid a_i$, $i = 0, 1, 2, \ldots, n - 1$;

(ii) $p \nmid a_n$ and

(iii) $p^2 \nmid a_0$,

then f(x) is irreducible in Q(R)[x].

Moreover, if f(x) is a primitive polynomial in R[x], then f(x) is also irreducible in R[x].

Proof First suppose that f(x) is a primitive polynomial in R[x] satisfying the above divisibility conditions with respect to p. If possible, let f(x) be reducible in R[x]. Then $\exists\, g(x), h(x) \in R[x]$ such that $f(x) = g(x)h(x)$, where $\deg g < n$ and $\deg h < n$.

Let

$$g(x) = g_0 + g_1 x + \cdots + g_r x^r, \quad \text{where } g_r \ne 0 \text{ and } 0 < r < n$$

and

$$h(x) = h_0 + h_1 x + \cdots + h_t x^t, \quad \text{where } h_t \ne 0 \text{ and } 0 < t < n.$$

Hence

$$f(x) = g(x)h(x) \Rightarrow a_i = g_i h_0 + g_{i-1} h_1 + \cdots + g_0 h_i.$$

Now $p \mid a_0 = g_0 h_0$ $\Rightarrow$ $p \mid g_0$ or $p \mid h_0$, but not both, otherwise $p^2 \mid a_0$, which is a contradiction. Suppose $p \mid g_0$ but $p \nmid h_0$. Clearly, p does not divide C(g), otherwise all the coefficients of f(x) would be divisible by p, which is not true, since f(x) is a primitive polynomial in R[x]. Let $g_s$ be the first of the $g_i$'s such that $g_s$ is not divisible by p, where $s \le r < n$. Then p divides all the coefficients $g_0, g_1, \ldots, g_{s-1}$.

But $a_s = g_s h_0 + g_{s-1} h_1 + \cdots + g_0 h_s$ and $p \mid a_s$, $p \mid g_{s-1}, \ldots, p \mid g_0$ $\Rightarrow$ $p \mid g_s h_0$ $\Rightarrow$ either $p \mid g_s$ or $p \mid h_0$ $\Rightarrow$ a contradiction in either case. Hence f(x) is irreducible in R[x].

Next we suppose $f(x) = C(f) f_1(x)$ with $f_1(x)$ primitive in R[x] (in particular $f_1 = f$ if f is primitive in R[x]). If $f_1(x) = b_0 + b_1 x + \cdots + b_n x^n$ and $C(f) = d$, then $a_i = d b_i$ for $i = 0, 1, 2, \ldots, n$. Since p does not divide $a_n$, p cannot divide d. Thus p divides $b_i$, $i = 0, 1, 2, \ldots, n - 1$, p does not divide $b_n$ and $p^2$ does not divide $b_0$. Hence by the above argument, $f_1(x)$ is irreducible in R[x] and thus it is irreducible in Q(R)[x] by Lemma 6.3.1. As d is a non-zero element in the field Q(R), d is a unit in Q(R)[x], showing that f(x) is irreducible in Q(R)[x]. □

Corollary 1 If p is a prime integer and n (> 1) is an integer, then $x^n - p \in \mathbf{Z}[x]$ is irreducible in $\mathbf{Z}[x]$.

Remark The Eisenstein criterion gives only a sufficient condition for irreducibility, not a necessary one (see Ex. 4 of SE-I).

Corollary 2 (The Eisenstein criterion for $\mathbf{Z}[x]$) Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbf{Z}[x]$, $a_n \ne 0$, be a primitive polynomial in $\mathbf{Z}[x]$. If there is a prime integer p such that

(i) $p \mid a_i$, $i = 0, 1, 2, \ldots, n - 1$;

(ii) $p \nmid a_n$ and

(iii) $p^2 \nmid a_0$,

then f(x) is irreducible in $\mathbf{Z}[x]$.

Proof It follows from Theorem 6.3.1 by taking $R = \mathbf{Z}$. □
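The three divisibility conditions are easy to check mechanically over $\mathbf{Z}$. The sketch below does only that, assuming a coefficient list $a_0, a_1, \ldots, a_n$ with the constant term first; the name `eisenstein_holds` is illustrative, and the function does not verify that p is prime or that f is primitive. A result of `False` says nothing about irreducibility, in line with the Remark above.

```python
def eisenstein_holds(coeffs, p):
    """Check conditions (i)-(iii) of the Eisenstein criterion for f in Z[x] at p.

    coeffs lists a_0, a_1, ..., a_n (lowest degree first), with n >= 1.
    The caller is responsible for passing a prime p.
    """
    *lower, leading = coeffs
    return (
        len(lower) >= 1
        and all(a % p == 0 for a in lower)   # (i)   p | a_i for i = 0, ..., n-1
        and leading % p != 0                 # (ii)  p does not divide a_n
        and coeffs[0] % (p * p) != 0         # (iii) p^2 does not divide a_0
    )


print(eisenstein_holds([-2, 0, 0, 0, 0, 1], 2))  # x^5 - 2 at p = 2, as in Corollary 1
print(eisenstein_holds([3, 6, 0, 0, 2], 3))      # 2x^4 + 6x + 3 at p = 3
print(eisenstein_holds([4, 0, 1], 2))            # x^2 + 4 at p = 2: condition (iii) fails
```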

Theorem 6.3.2 (Gauss Theorem) If R is a UFD, then R[x] is also a UFD.

Proof Let $f(x) \in R[x]$ and F = Q(R), the quotient field of R. Then F[x] is a UFD by Corollary 2 of Theorem 6.2.2. Suppose $f(x) = a f_1(x)$, where $a = C(f)$ and $f_1 \in R[x]$ is a primitive polynomial. Now F[x] is a UFD $\Rightarrow$ $f_1(x) = g_1(x) g_2(x) \cdots g_n(x)$, where $g_i(x) \in F[x]$ are irreducible. We may take $g_i(x) = (a_i/b_i) h_i(x)$, where $a_i, b_i\ (\ne 0) \in R$ and $h_i(x) \in R[x]$ are primitive.

Hence

$$f_1(x) = \frac{a_1 a_2 \cdots a_n}{b_1 b_2 \cdots b_n}\, h_1(x) h_2(x) \cdots h_n(x)$$

$$\Rightarrow (b_1 b_2 \cdots b_n) f_1(x) = (a_1 a_2 \cdots a_n) h_1(x) h_2(x) \cdots h_n(x)$$

$$\Rightarrow f_1(x) = h_1(x) h_2(x) \cdots h_n(x),$$

since $b_1 b_2 \cdots b_n = a_1 a_2 \cdots a_n$ (comparing the contents), where $h_i(x) \in R[x]$ are primitive and irreducible in F[x] and hence irreducible in R[x] also.

Let $C(f) = a = a_1 a_2 \cdots a_t$, where $a_i \in R$ are irreducible. Then $f(x) = a f_1(x) = (a_1 a_2 \cdots a_t) h_1(x) h_2(x) \cdots h_n(x)$ is a factorization of f(x) as a product of irreducible elements of R[x].

Let

$$f(x) = (a_1 a_2 \cdots a_t) h_1(x) h_2(x) \cdots h_n(x)$$

$$= (b_1 b_2 \cdots b_s) k_1(x) k_2(x) \cdots k_m(x)$$

be two factorizations of f(x) as a product of irreducible elements of R[x]. Comparing the contents and using the Gauss Lemma, the uniqueness of the above factorizations follows. □

Corollary 1 Z[x] is a UFD. 

Corollary 2 If R is a UFD, then $R[x_1, x_2, \ldots, x_n]$ is also a UFD.

Proof It follows from Gauss Theorem 6.3.2 by an inductive argument using the result that $R[x_1, x_2, \ldots, x_n] \cong R[x_1, x_2, \ldots, x_{n-1}][x_n]$. □

We now present an important application of irreducibility of polynomials over a 
UFD in construction of finite fields. 

Theorem 6.3.3 Given a prime integer p and a positive integer n, there exists a finite field of order $p^n$.

Proof Consider the polynomial ring $\mathbf{Z}_p[x]$ and an irreducible polynomial f(x) of degree n over $\mathbf{Z}_p$. As $\mathbf{Z}_p$ is a field, $\mathbf{Z}_p[x]$ is a PID and $\langle f(x) \rangle$ is a maximal ideal of $\mathbf{Z}_p[x]$. Thus, $\mathbf{Z}_p[x]/\langle f(x) \rangle$ is a field. As $\deg f(x) = n$, an element of $\mathbf{Z}_p[x]/\langle f(x) \rangle$ is uniquely of the form

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1} + \langle f(x) \rangle,$$

where $a_i \in \mathbf{Z}_p$, $i = 0, 1, \ldots, n - 1$.

Thus, $|\mathbf{Z}_p[x]/\langle f(x) \rangle| = p^n$. □
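For small p and n the construction in the proof can be carried out by brute force. The sketch below searches for a monic irreducible polynomial of degree n over $\mathbf{Z}_p$ by trial division, lists the $p^n$ residues of degree less than n, and counts how many non-zero residues have a multiplicative inverse. All function names and the coefficient-list representation (constant term first) are assumptions of this example, and the exhaustive search is only practical for very small cases.

```python
from itertools import product


def poly_mod(f, m, p):
    """Remainder of f modulo the monic polynomial m over Z_p, padded to length deg m."""
    r = [c % p for c in f]
    n = len(m) - 1
    for k in range(len(r) - 1, n - 1, -1):
        c = r[k]
        if c:
            for i in range(n + 1):   # subtract c * x^(k-n) * m(x)
                r[k - n + i] = (r[k - n + i] - c * m[i]) % p
    r = r[:n]
    return r + [0] * (n - len(r))


def poly_mul(f, g, p):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def is_irreducible(m, p):
    """Trial division of the monic m by every monic polynomial of degree 1..deg(m)//2."""
    n = len(m) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(p), repeat=d):
            if not any(poly_mod(m, list(low) + [1], p)):
                return False
    return True


def find_irreducible(p, n):
    """First monic irreducible polynomial of degree n over Z_p found by exhaustive search."""
    for low in product(range(p), repeat=n):
        m = list(low) + [1]
        if is_irreducible(m, p):
            return m
    return None


def field_check(p, n):
    """Return (modulus, number of residues, number of invertible residues)."""
    m = find_irreducible(p, n)
    elements = [list(e) for e in product(range(p), repeat=n)]
    one = [1] + [0] * (n - 1)
    units = sum(
        any(poly_mod(poly_mul(a, b, p), m, p) == one for b in elements)
        for a in elements if any(a)
    )
    return m, len(elements), units


print(field_check(2, 3))   # 8 residues, of which all 7 non-zero ones are invertible
```

In each case the number of invertible residues equals $p^n - 1$, which is what it means for the quotient to be a field of order $p^n$.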


6.4 Supplementary Examples (SE-I) 

1 If p is a prime integer, then $\sqrt[n]{p}$ is an irrational number for every positive integer n > 1.

[Hint. $x^n - p \in \mathbf{Z}[x]$ is irreducible in $\mathbf{Z}[x]$ by Corollary 1 of Theorem 6.3.1. Then by the Gauss Lemma it is also irreducible in $\mathbf{Q}[x]$. Hence it has no rational root.]

2 Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbf{Z}[x]$, $a_n \ne 0$. If there is a prime integer p such that

(i) $p \mid a_i$, $i = 0, 1, 2, \ldots, n - 1$;

(ii) $p \nmid a_n$, and

(iii) $p^2 \nmid a_0$,

then f(x) is irreducible in $\mathbf{Q}[x]$.

[Hint. Use Theorem 6.3.1.]

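3 $\mathbf{Z}[x]$ is a UFD but not a PID.

[Hint. $\mathbf{Z}[x]$ is a UFD by Corollary 1 to Theorem 6.3.2. To show that $\mathbf{Z}[x]$ is not a PID, let $I = \{f = a_0 + a_1 x + \cdots + a_n x^n \in \mathbf{Z}[x] : a_0 \text{ is even}\}$. Then I is not a principal ideal, otherwise $I = \langle h(x) \rangle$ for some $h(x) \in \mathbf{Z}[x]$. As $2 \in I$, h must divide 2 and hence $h = \pm 2$ (the case $h = \pm 1$ is excluded since $I \ne \mathbf{Z}[x]$). But $x + 2 \in I$ and h does not divide x + 2. Hence $\mathbf{Z}[x]$ cannot be a PID.]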

4 For any prime integer p, the cyclotomic polynomial $\phi_p(x) = 1 + x + x^2 + \cdots + x^{p-1}$ is irreducible in $\mathbf{Q}[x]$.

[Hint. No prime integer satisfies the conditions of the Eisenstein criterion for $\phi_p(x)$ itself, though $\phi_p(x)$ is irreducible in $\mathbf{Z}[x]$. If $\phi_p$ is not irreducible in $\mathbf{Z}[x]$, then $\exists$ non-trivial polynomials $f(x), g(x) \in \mathbf{Z}[x]$ such that $\phi_p(x) = f(x)g(x)$. Hence $\phi_p(x + 1) = f(x + 1)g(x + 1)$ is a non-trivial factorization of $\phi_p(x + 1)$ in $\mathbf{Z}[x]$. On the other hand,

$$\phi_p(x) = \frac{x^p - 1}{x - 1} \Rightarrow \phi_p(x + 1) = \frac{(x + 1)^p - 1}{x} = p + \binom{p}{2} x + \cdots + \binom{p}{r} x^{r-1} + \cdots + p x^{p-2} + x^{p-1}$$

is irreducible in $\mathbf{Z}[x]$ by Theorem 6.3.1. Hence $\phi_p(x)$ is irreducible in $\mathbf{Z}[x]$ $\Rightarrow$ $\phi_p(x)$ is also irreducible in $\mathbf{Q}[x]$ by the Gauss Lemma.]
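The substitution $x \mapsto x + 1$ used in the hint to Ex. 4 can be checked numerically for small primes. The sketch below builds the coefficients of $\phi_p(x + 1)$ from binomial coefficients and tests the three Eisenstein conditions at p; the function names are illustrative and the coefficient lists start with the constant term.

```python
from math import comb


def shifted_cyclotomic(p):
    """Coefficients (lowest degree first) of phi_p(x + 1) = ((x + 1)^p - 1) / x."""
    # The coefficient of x^(r-1) is the binomial coefficient C(p, r), r = 1, ..., p.
    return [comb(p, r) for r in range(1, p + 1)]


def satisfies_eisenstein_at(coeffs, p):
    """Conditions (i)-(iii) of the Eisenstein criterion over Z at the prime p."""
    *lower, leading = coeffs
    return (all(a % p == 0 for a in lower)
            and leading % p != 0
            and coeffs[0] % (p * p) != 0)


print(shifted_cyclotomic(5))
for p in (2, 3, 5, 7, 11, 13):
    print(p, satisfies_eisenstein_at(shifted_cyclotomic(p), p))
# phi_5(x) itself has all coefficients equal to 1, so no prime can satisfy condition (i).
print(satisfies_eisenstein_at([1, 1, 1, 1, 1], 5))
```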

5 If R is a PID, then R[x] is not necessarily so. If R[x] is a PID, then R is a field.

[Hint. $\mathbf{Z}$ is a PID but $\mathbf{Z}[x]$ is not so by Ex. 3. Suppose R[x] is a PID. Consider the map $f : R[x] \to R$ defined by $f(\sum_{i=0}^{n} a_i x^i) = a_0$. Then f is a ring epimorphism with $\ker f = \langle x \rangle$ (see W.O. Ex. 7 of Chap. 5).

Hence $R[x]/\langle x \rangle \cong R$ $\Rightarrow$ $\langle x \rangle$ is a prime ideal in R[x] $\Rightarrow$ $\langle x \rangle$ is a maximal ideal in R[x] by Theorem 5.3.5 of Chap. 5 $\Rightarrow$ R is a field by Theorem 5.3.2.]

6 The ideal $\langle x^2 + 1 \rangle$ is a maximal ideal in $\mathbf{R}[x]$.

[Hint. If possible, let $\exists$ an ideal I in $\mathbf{R}[x]$ such that $\langle x^2 + 1 \rangle \subsetneq I \subseteq \mathbf{R}[x]$. Then $\exists\, f(x) \in I$ such that $f(x) \notin \langle x^2 + 1 \rangle$. By the division algorithm, $f(x) = (x^2 + 1)q(x) + r(x)$, where $r(x) \ne 0$ and $\deg(r(x)) < 2$. Let $r(x) = ax + b$, where $a, b \in \mathbf{R}$ are not both 0. Hence $ax + b = f(x) - (x^2 + 1)q(x) \in I$ $\Rightarrow$ $(ax + b)(ax - b) \in I$ $\Rightarrow$ $a^2 x^2 - b^2 \in I$. Moreover, $a^2(x^2 + 1) \in I$.

Hence $a^2 + b^2 \in I$ $\Rightarrow$ I contains a non-zero real number $\Rightarrow$ $I = \mathbf{R}[x]$.]


6.5 Exercises

Exercises-I R denotes an integral domain and R' = R \ { 0 } (unless specified 
otherwise). 

1. Show that the binary relation $\rho$ on R defined by $a \rho b \Leftrightarrow$ a is an associate of b, is an equivalence relation with equivalence classes which are sets of associated elements, and the associates of the identity are precisely the invertible elements of R. Find the associates of an integer n in $(\mathbf{Z}, +, \cdot)$.

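2. Let R be a PID. If $\{A_n\}$ is an infinite sequence of ideals satisfying

$$A_1 \subseteq A_2 \subseteq \cdots \subseteq A_n \subseteq A_{n+1} \subseteq \cdots,$$

show that there exists an integer m such that $A_n = A_m$ $\forall n \ge m$.

[Hint. Consider $A = \bigcup A_n$. Then A is a principal ideal. Now $A = \langle a \rangle$ $\Rightarrow$ $a \in A_m$ (for some m) $\Rightarrow$ $\forall n \ge m$, $A = \langle a \rangle \subseteq A_m \subseteq A_n \subseteq A$ $\Rightarrow$ $A_n = A_m$.]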

3. Let R be a PID. Show that the non-trivial ideal $\langle p \rangle$ is a maximal (prime) ideal of R $\Leftrightarrow$ p is an irreducible (prime) element of R.

[Hint. Use Theorem 6.1.2(ii) for the first assertion. For the second assertion, let p be a prime element of R and $ab \in \langle p \rangle$. Then $ab = rp$ for some $r \in R$ $\Rightarrow$ $p \mid ab$ $\Rightarrow$ $p \mid a$ or $p \mid b$ $\Rightarrow$ $a \in \langle p \rangle$ or $b \in \langle p \rangle$ $\Rightarrow$ $\langle p \rangle$ is a prime ideal of R. The converse is similar.]

4. Let R be a PID and a ($\ne 0$) be a non-unit (i.e., non-invertible) in R. Show that there exists a prime $p \in R$ such that $p \mid a$.

[Hint. a ($\ne 0$) is a non-unit in R $\Rightarrow$ $\langle a \rangle$ is a proper ideal of R (see Remark 2 after Theorem 6.1.1) $\Rightarrow$ $\exists$ a maximal ideal M of R such that $\langle a \rangle \subseteq M$ (by the Corollary of Theorem 5.3.6 of Chap. 5) $\Rightarrow$ $\langle a \rangle \subseteq M = \langle p \rangle$ for some prime element $p \in R$ $\Rightarrow$ $p \mid a$.]

5. Let a, b, c be non-zero elements of a principal ideal domain R. If $c \mid ab$ and gcd(a, c) = 1, then show that $c \mid b$.

[Hint. gcd(a, c) = 1 $\Rightarrow$ $1 = ra + tc$ (for some $r, t \in R$) $\Rightarrow$ $b = 1b = rab + tcb$. Now $c \mid ab$ and $c \mid c$ $\Rightarrow$ $c \mid (rab + tcb)$ $\Rightarrow$ $c \mid b$.]

6. Let $a_1, a_2, \ldots, a_n$ and r be non-zero elements of an integral domain R. Prove that

(a) if $\mathrm{lcm}(a_1, a_2, \ldots, a_n)$ exists, then $\mathrm{lcm}(ra_1, ra_2, \ldots, ra_n)$ also exists and $\mathrm{lcm}(ra_1, ra_2, \ldots, ra_n) = r\,\mathrm{lcm}(a_1, a_2, \ldots, a_n)$;

(b) if $\gcd(ra_1, ra_2, \ldots, ra_n)$ exists, then $\gcd(a_1, a_2, \ldots, a_n)$ also exists and $\gcd(ra_1, ra_2, \ldots, ra_n) = r \gcd(a_1, a_2, \ldots, a_n)$.

7. Let $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ be non-zero elements of an integral domain R such that $a_1 b_1 = a_2 b_2 = \cdots = a_n b_n = x$; show that

(a) if $\mathrm{lcm}(a_1, a_2, \ldots, a_n)$ exists, then $\gcd(b_1, b_2, \ldots, b_n)$ also exists and satisfies: $\mathrm{lcm}(a_1, a_2, \ldots, a_n) \gcd(b_1, b_2, \ldots, b_n) = x$;

(b) if $\gcd(ra_1, ra_2, \ldots, ra_n)$ exists $\forall r\ (\ne 0) \in R$, then $\mathrm{lcm}(b_1, b_2, \ldots, b_n)$ also exists and satisfies: $\gcd(a_1, a_2, \ldots, a_n)\, \mathrm{lcm}(b_1, b_2, \ldots, b_n) = x$.

8. Prove that an integral domain R has the gcd property $\Leftrightarrow$ R has the lcm property.

9. In a principal ideal domain R, prove that every non-trivial ideal is the product of a finite number of prime (maximal) ideals.

[Hint. $a \in R'$ is non-invertible $\Rightarrow$ $a = p_1 p_2 \cdots p_n$, where each $p_i$ is a prime element of R $\Rightarrow$ $\langle a \rangle = \langle p_1 p_2 \cdots p_n \rangle = \langle p_1 \rangle \langle p_2 \rangle \cdots \langle p_n \rangle$, where each $\langle p_i \rangle$ is a prime ideal of R.]

10. Prove the Euclid Theorem: There are an infinite number of primes in $\mathbf{Z}$.

[Hint. Suppose there are only a finite number of primes, $p_1, p_2, \ldots, p_n$ (say). Then $x = (p_1 p_2 \cdots p_n) + 1$ is an integer > 1 such that x is not divisible by any of these primes. But Theorem 6.1.5 $\Rightarrow$ x must have a prime factor $\Rightarrow$ x is divisible by a prime different from the above primes $\Rightarrow$ $\exists$ an infinite number of primes in $\mathbf{Z}$.]

11. Let F be a field and F[x] the polynomial extension of F and $f, g \in F[x]$ such that $g \ne 0$. Show that there exist unique polynomials $q, r \in F[x]$ such that $f = gq + r$, where either $r = 0$ or $\deg r < \deg g$.

12. Let F be a field. Show that F[x] is a Euclidean domain with a Euclidean norm function $\delta$ such that $\delta$ satisfies the additional condition:

$$\delta(f + g) \le \max(\delta(f), \delta(g)).$$

[Hint. Define $\delta : F[x] \setminus \{0\} \to \mathbf{N}$ by

$$\delta(f) = 2^{\deg f}.$$

Then

$$\delta(f + g) = 2^{\deg(f + g)} \le 2^{\max(\deg f, \deg g)} \quad \text{(by Proposition 4.5.2)}$$

$$= \max(2^{\deg f}, 2^{\deg g}) = \max(\delta(f), \delta(g)).]$$

13. Let F be a Euclidean domain with norm function $\delta$ satisfying the additional condition $\delta(a + b) \le \max(\delta(a), \delta(b))$. Prove that either F is a field K or $F \cong K[x]$ for some field K.

[Hint. Use the fact that a Euclidean domain contains an identity e.

Then $0 \ne \delta(e) = \delta(e^2) = \delta(e)\delta(e)$ $\Rightarrow$ $\delta(e) = 1$. Define $K = \{a \in F : \delta(a) \le 1\}$. Now $a, b \in K$ $\Rightarrow$ $\delta(a - b) \le \max(\delta(a), \delta(b)) \le 1$ and $\delta(ab) = \delta(a)\delta(b) \le 1$ $\Rightarrow$ K is a subring of F.]

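14. Let $(R, \delta)$ be a Euclidean domain and A any ideal of R. Show that there exists an element $a_0 \in A$ such that $A = \langle a_0 \rangle$.

[Hint. Let $A \ne \{0\}$ and $S = \{\delta(x) : x \in A \setminus \{0\}\} \subseteq \mathbf{N}^+$ $\Rightarrow$ S has a least element $m_0 > 0$ and let $a_0 \in A \setminus \{0\}$ be such that $\delta(a_0) = m_0$. For $x \in A$, $\exists\, q, r \in R$ such that $x = a_0 q + r$, where $r = 0$ or $\delta(r) < \delta(a_0)$ $\Rightarrow$ $r = 0$ (proceed as in Theorem 6.2.1) $\Rightarrow$ $A \subseteq \langle a_0 \rangle$.]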

15. Show that the quadratic domain $\mathbf{Z}[\sqrt{-5}] = \{a + b\sqrt{-5} : a, b \in \mathbf{Z}\}$ with usual addition and multiplication of complex numbers does not admit unique factorization. Also show that 3 is irreducible but not prime in $\mathbf{Z}[\sqrt{-5}]$.

[Hint. 2, $1 + \sqrt{-5}$, $1 - \sqrt{-5}$ are irreducible elements in $\mathbf{Z}[\sqrt{-5}]$. Now $3 \mid 6$ $\Rightarrow$ $3 \mid (1 + \sqrt{-5})(1 - \sqrt{-5})$ in $\mathbf{Z}[\sqrt{-5}]$. But $3 \nmid (1 + \sqrt{-5})$ and $3 \nmid (1 - \sqrt{-5})$ $\Rightarrow$ 3 is not prime.]
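The hint to Ex. 15 can be checked with the norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$, which is multiplicative. The sketch below is an illustration under the assumption that an element $a + b\sqrt{-5}$ is stored as the pair `(a, b)`; the function names are hypothetical. Since no element has norm 2 or 3, an element of norm 4, 6 or 9 cannot split into two non-units, which is why 2, 3 and $1 \pm \sqrt{-5}$ are irreducible.

```python
def norm(a, b):
    """Norm N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    return a * a + 5 * b * b


def mul(u, v):
    """Product in Z[sqrt(-5)]; u = (a, b) stands for a + b*sqrt(-5)."""
    (a, b), (c, d) = u, v
    return (a * c - 5 * b * d, a * d + b * c)


def divides(u, v):
    """True if u divides v in Z[sqrt(-5)], for u != 0."""
    (a, b), n = u, norm(*u)
    # v / u = v * conj(u) / N(u), so u | v iff both coordinates are multiples of N(u).
    x, y = mul(v, (a, -b))
    return x % n == 0 and y % n == 0


def norm_is_attained(n, bound=10):
    """Search |a|, |b| <= bound for an element of norm n (sufficient for small n)."""
    return any(norm(a, b) == n
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))


print(mul((1, 1), (1, -1)))                          # (1 + sqrt(-5))(1 - sqrt(-5)) = 6
print(divides((3, 0), (6, 0)))                       # 3 divides the product ...
print(divides((3, 0), (1, 1)), divides((3, 0), (1, -1)))  # ... but neither factor
print(norm_is_attained(2), norm_is_attained(3))      # no elements of norm 2 or 3
```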

16. (The division algorithm in polynomial ring.) Let R be a commutative ring with identity and $f, g \in R[x]$ non-zero polynomials such that the leading coefficient of g is an invertible element. Show that there exist unique polynomials $q, r \in R[x]$ such that $f = qg + r$, where either $r = 0$ or $\deg r < \deg g$.

17. (Remainder Theorem.) Let R be a commutative ring with identity. If $f(x) \in R[x]$ and $a \in R$, show that there exists a unique polynomial q(x) in R[x] such that $f(x) = (x - a)q(x) + f(a)$.

18. An element $a \in R$ is said to be a root of a polynomial $f(x) \in R[x]$ iff $f(a) = 0$. Show that f(x) is divisible by $x - a$ $\Leftrightarrow$ a is a root of f(x).

19. Let R be an integral domain and $f(x) \in R[x]$ a non-zero polynomial of degree n. Show that f(x) can have at most n distinct roots.

20. Let R be an integral domain and $f(x), g(x) \in R[x]$ be two non-zero polynomials of degree n. If there exist n + 1 distinct elements $a_i \in R$ such that $f(a_i) = g(a_i)$ for $i = 1, 2, \ldots, n + 1$, show that $f(x) = g(x)$.

21. Let R be an integral domain and A be any infinite subset of R. If $f(a) = 0$ $\forall a \in A$, show that $f = 0$.

22. (a) Let R be a commutative ring with 1. Show that the following statements are 

equivalent: 

(i) R is a field; 

(ii) R[x] is a Euclidean domain; 

(iii) R[x] is a principal ideal domain. 

(b) Using (a) show that Z[x] is not a principal ideal domain. 

23. Show that $\mathbf{Z}[\sqrt{n}]$ is a Euclidean domain for n = −1, −2, 2, 3, where $\mathbf{Z}[\sqrt{n}] = \{a + b\sqrt{n} : a, b \in \mathbf{Z}\}$.

24. (a) Let D be the integral domain given by $D = \mathbf{Z}[i\sqrt{3}] = \{a + bi\sqrt{3} : a, b \in \mathbf{Z}\}$ and $v : D \to \mathbf{N}$ be the map defined by $v(a + bi\sqrt{3}) = a^2 + 3b^2$. Show that v is not a Euclidean valuation on D.

(b) If $a^2 + 3b^2$ is a prime integer, show that $a + bi\sqrt{3}$ is an irreducible element in $\mathbf{Z}[i\sqrt{3}]$.

25. Let R be a UFD with quotient field F . Show that 

(a) two primitive polynomials in R[x] are associates in R[x] iff they are associates in F[x];

(b) a primitive polynomial of positive degree in R[x] is irreducible in R[x] iff 
it is irreducible in F[x]. 

26. Identify the correct alternative(s) (there may be more than one) from the fol- 
lowing list: 

(a) Let I be the ideal generated by $x^2 + 1$ and J be the ideal generated by $x^3 - x^2 + x - 1$ in $\mathbf{Q}[x]$. If $R = \mathbf{Q}[x]/I$ and $S = \mathbf{Q}[x]/J$, then

(i) R and S are both fields; 

(ii) R is an integral domain, but S is not so; 
(iii) R is a field but S is not so;

(iv) R and S are not integral domains. 

(b) Which of the following integral domains are Euclidean domains? 

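(i) $R[x^2, x^3] = \{f = \sum_{i=0}^{n} a_i x^i \in R[x] : a_1 = 0\}$;

(ii) $\mathbf{Z}[\sqrt{-3}] = \{a + b\sqrt{-3} : a, b \in \mathbf{Z}\}$;

(iii) $\mathbf{Z}[x]$;

(iv) $(\mathbf{Z}[x]/\langle 2, x \rangle)[y]$, where x, y are independent variables and $\langle 2, x \rangle$ is the ideal generated by 2 and x.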

(c) Let I be the ideal generated by $1 + x^2$ in $\mathbf{Q}[x]$ and R be the ring given by $R = \mathbf{Q}[x]/I$. If y is the coset of x in R, then

(i) $y^2 + 1$ is irreducible over R;

(ii) $y^2 - y + 1$ is irreducible over R;

(iii) $y^3 + y^2 + y + 1$ is irreducible over R;

(iv) $y^2 + y + 1$ is irreducible over R.

(d) Let $f_n(x) = x^{n-1} + x^{n-2} + \cdots + x + 1$ be a polynomial over $\mathbf{Q}$, where n > 1. Then

(i) $f_n(x)$ is an irreducible polynomial in $\mathbf{Q}[x]$ for every positive integer n;

(ii) $f_{p^e}(x)$ is an irreducible polynomial in $\mathbf{Q}[x]$ for every prime integer p and every positive integer e;

(iii) $f_p(x)$ is an irreducible polynomial in $\mathbf{Q}[x]$ for every prime integer p;

(iv) $f_p(x^{p^{e-1}})$ is an irreducible polynomial in $\mathbf{Q}[x]$ for every prime integer p and every prime integer e.

(e) Consider the element $\alpha = 3 + \sqrt{-5}$ in the ring $R = \mathbf{Z}[\sqrt{-5}] = \{a + b\sqrt{-5} : a, b \in \mathbf{Z}\}$. Then

(i) R is an integral domain; 

(ii) a is irreducible; 

(iii) a is prime; 

(iv) R is not a unique factorization domain. 

(f) Let $F_p$ be the field $\mathbf{Z}_p$, where p is a prime. Let $F_p[x]$ be the associated polynomial ring. Then

(i) $F_5[x]/\langle x^2 + x + 1 \rangle$ is a field;

(ii) $F_2[x]/\langle x^3 + x + 1 \rangle$ is a field;

(iii) $F_3[x]/\langle x^3 + x + 1 \rangle$ is a field;

(iv) $F_7[x]/\langle x^2 + 1 \rangle$ is a field.

(g) Which of the following statement(s) is (are) false? 

(i) A homomorphic image of a UFD (unique factorization domain) is again a UFD;

(ii) units of the ring $\mathbf{Z}[\sqrt{-5}]$ are the units of $\mathbf{Z}$;

(iii) the element $2 \in \mathbf{Z}[\sqrt{-5}]$ is irreducible in $\mathbf{Z}[\sqrt{-5}]$;

(iv) the element 2 is a prime element in $\mathbf{Z}[\sqrt{-5}]$.
(h) Let R be the polynomial ring $\mathbf{Z}_2[x]$ and write the elements of $\mathbf{Z}_2$ as {0, 1}. If $f(x) = x^2 + x + 1$, then the quotient ring $R/\langle f(x) \rangle$ is

(i) a ring but not an integral domain; 

(ii) a finite field of order 4; 

(iii) an integral domain but not a field; 

(iv) an infinite field. 

(i) Which of the following ideal(s) is(are) maximal? 

(i) The ideal 17Z in Z; 

(ii) the ideal $I = \{f : f(0) = 0\}$ in the ring $C([0, 1])$ of all continuous real valued functions on the interval [0, 1];

(iii) the ideal $25\mathbf{Z}$ in $\mathbf{Z}$;

(iv) the ideal generated by $x^3 - x + 1$ in the ring of polynomials $F_3[x]$, where $F_3$ is the field of three elements.


6.6 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Artin 1991; Atiyah and Macdonald 1969; Birkhoff and Mac Lane 1965; Burton 1968; Fraleigh 1982; Herstein 1964; Hungerford 1974; Jacobson 1974, 1980; Lang 1965; McCoy 1964; van der Waerden 1970; Zariski and Samuel 1958, 1960) for further details.


Chapter 7

Rings with Chain Conditions 


In this chapter we continue to study the theory of rings by imposing finiteness conditions (chain conditions) on ideals. The motivation of chain conditions for ideals of a ring came from an interesting property of $\mathbf{Z}$. In the ring of integers $\mathbf{Z}$, let $I_1$ be a non-zero ideal and I be an ideal such that $I_1 \subseteq I$. Then $\exists$ non-zero integers m and n such that $I_1 = \langle m \rangle$, $I = \langle n \rangle$ and $n \mid m$, as $\mathbf{Z}$ is a PID. Since m has only finitely many divisors n in $\mathbf{Z}$, there are only finitely many ideals I such that $I_1 \subseteq I$, and so there cannot exist infinitely many ideals $I_t$ (t = 2, 3, ...) such that $I_1 \subsetneq I_2 \subsetneq I_3 \subsetneq \cdots \subsetneq I_t \subsetneq \cdots$. This interesting property of $\mathbf{Z}$ was first recognized by the German mathematician Emmy Noether (1882-1935). On the other hand, $\mathbf{Z}$ has an infinite descending chain of ideals $\langle 3 \rangle \supsetneq \langle 6 \rangle \supsetneq \langle 12 \rangle \supsetneq \cdots$.

Descending chains of this kind lead to the concept of Artinian rings, named in honor of Emil Artin (1898-1962). In this chapter we study special classes of rings and
obtain deeper results on ideal theory. For this we need to impose some finiteness 
conditions and introduce Noetherian rings which are versatile. The most conve- 
nient equivalent formulation of the Noetherian requirement is that the ideals of the 
ring satisfy the ascending chain condition. We establish a connection between a 
Noetherian domain and a factorization domain. We prove the Hilbert Basis Theo- 
rem which gives an extensive class of Noetherian rings. Its application to algebraic 
geometry is discussed. We also prove Cohen’s theorem. Our study culminates in 
rings with descending chain condition for ideals, which determines their ideal struc- 
ture. 

In this chapter a ring R means a commutative ring with identity, unless otherwise 

stated. 


7.1 Noetherian and Artinian Rings 

The basic concepts of the modern theory of rings arose through the work of Emmy Noether and Emil Artin during the 1920s.

Noetherian and Artinian rings form a special class of rings. In this section we study such rings with the help of their ideals and determine their ideal structures. This study develops ring theory. In this section we prove the Hilbert Basis Theorem, which revolutionized modern algebra and algebraic geometry. We further show that any Noetherian domain is a factorization domain and any Artinian domain is a field.

Definition 7.1.1 A ring R is said to satisfy the ascending (descending) chain condition, denoted by acc (dcc), for ideals iff given any sequence of ideals $I_1, I_2, \ldots$ of R with $I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots$ ($I_1 \supseteq I_2 \supseteq \cdots \supseteq I_n \supseteq \cdots$), there exists an integer n (depending on the sequence) such that $I_m = I_n$ $\forall m \ge n$.

For non-commutative rings, the definitions of acc (dcc) for left ideals or for right ideals are similar.

Definition 7.1.2 A ring R is said to be a Noetherian ring iff it satisfies the ascending chain condition for ideals of R. (This name is in honor of Emmy Noether.)

For non-commutative rings the definition of a left (right) Noetherian ring is similar.

Definition 7.1.3 A ring R is said to satisfy the maximal condition (for ideals) iff 
every non-empty set of ideals of R , partially ordered by inclusion, has a maximal 
element (i.e., an ideal which is not properly contained in any other ideal of the set). 

Theorem 7.1.1 Let R be a ring. Then the following statements are equivalent:

(i) R is Noetherian. 

(ii) The maximal condition (for ideals ) holds in R . 

(iii) Every ideal of R is finitely generated. 

Proof (i) $\Rightarrow$ (ii) Let $\mathcal{F}$ be any non-empty collection of ideals of R and $I_1 \in \mathcal{F}$. If $I_1$ is not a maximal element, $\exists$ an element $I_2 \in \mathcal{F}$ such that $I_1 \subsetneq I_2$. Again, if $I_2$ is not a maximal element, then $I_2 \subsetneq I_3$ for some $I_3 \in \mathcal{F}$. If $\mathcal{F}$ has no maximal element, then continuing the above process, we obtain the infinite strictly ascending chain of ideals of R:

$$I_1 \subsetneq I_2 \subsetneq I_3 \subsetneq \cdots.$$

But this contradicts the assumption that R is Noetherian.

(ii) $\Rightarrow$ (iii) Let I be an ideal of R and $\mathcal{F} = \{A : A \text{ is an ideal of } R;\ A \text{ is finitely generated and } A \subseteq I\}$.

$\{0\} \subseteq I$ $\Rightarrow$ $\{0\} \in \mathcal{F}$ $\Rightarrow$ $\mathcal{F} \ne \emptyset$. Then (ii) $\Rightarrow$ $\mathcal{F}$ has a maximal element, say M. We show that M = I. Suppose $M \ne I$. Then $\exists$ an element $a \in I$ such that $a \notin M$. Now M is finitely generated. Suppose $M = \langle a_1, a_2, \ldots, a_t \rangle$. Then $A = \langle a_1, a_2, \ldots, a_t, a \rangle \in \mathcal{F}$ and $M \subsetneq A$ $\Rightarrow$ a contradiction (since M is a maximal element in $\mathcal{F}$) $\Rightarrow$ M = I $\Rightarrow$ I is finitely generated.

(iii) $\Rightarrow$ (i) Let $I_1 \subseteq I_2 \subseteq I_3 \subseteq \cdots$ be an ascending chain of ideals of R. Then $I = \bigcup_i I_i$ is an ideal of R, hence is finitely generated; suppose that $I = \langle a_1, a_2, \ldots, a_r \rangle$. Now each generator $a_t \in$ some ideal $I_{i_t}$ of the given chain. Let n be the maximum of the indices $i_t$.

Then each $a_t \in I_n$. Consequently, for $m \ge n$,

$$I = \langle a_1, a_2, \ldots, a_r \rangle \subseteq I_n \subseteq I_m \subseteq I \Rightarrow I_m = I_n$$

$\Rightarrow$ the given chain of ideals is stationary at some point $\Rightarrow$ R is Noetherian. □

Definition 7.1.4 An integral domain (ring) that satisfies any one of the equivalent 
conditions of Theorem 7.1.1 is called a Noetherian domain (ring). 

We now show that a Noetherian domain is a natural generalization of a PID. 

Proposition 7.1.1 Every principal ideal domain is a Noetherian domain. 

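Proof Let D be a principal ideal domain and $I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots$ be an ascending chain of ideals of D. Let $I = \bigcup_r I_r$. Then I is an ideal of D. Now $I = \langle a \rangle$ for some $a \in I$ $\Rightarrow$ $a \in I_t$ of the given chain for some t $\Rightarrow$ $I = \langle a \rangle \subseteq I_t$ $\Rightarrow$ $I \subseteq I_t \subseteq I_m \subseteq I$ $\forall m \ge t$ $\Rightarrow$ $I_m = I_t$ $\forall m \ge t$. □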
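In the PID $\mathbf{Z}$ this stabilization can be watched directly, because the ideal $\langle n_1, \ldots, n_k \rangle$ is generated by $\gcd(n_1, \ldots, n_k)$. The sketch below, with illustrative function names and an arbitrarily chosen input sequence, builds the ascending chain $I_k = \langle n_1, \ldots, n_k \rangle$ and reports where it becomes stationary within the computed range.

```python
from math import gcd


def ideal_chain_generators(seq):
    """Non-negative generators g_k of I_k = <n_1, ..., n_k> in Z, for k = 1, 2, ...

    Each I_k is principal, generated by the gcd of n_1, ..., n_k, so the
    ascending chain I_1, I_2, ... corresponds to a chain of successive gcds.
    """
    gens, g = [], 0
    for n in seq:
        g = gcd(g, n)
        gens.append(g)
    return gens


def stabilization_index(gens):
    """Smallest 1-based k such that I_m = I_k for all later m in the computed chain."""
    k = len(gens)
    while k > 1 and gens[k - 2] == gens[-1]:
        k -= 1
    return k


gens = ideal_chain_generators([360, 84, 90, 35, 49, 1000])
print(gens)                       # each generator divides the previous one
print(stabilization_index(gens))  # from this index on the chain is constant
```

Each strict inclusion drops at least one prime factor from the generator, so a chain starting at $\langle 360 \rangle$ can grow strictly only finitely many times.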


Most of the rings which we have been studying are Noetherian. 

Example 7.1.1 (i) $(\mathbf{Z}, +, \cdot)$ is a Noetherian ring by Proposition 7.1.1.

(ii) Every field is a Noetherian ring.

(iii) Every finite ring is a Noetherian ring.

(iv) If F is a field, then the integral domain $F[x_1, x_2, \ldots]$ (in infinitely many indeterminates $x_1, x_2, \ldots$) is not Noetherian, as it contains the infinite strictly ascending chain of ideals $\langle x_1 \rangle \subsetneq \langle x_1, x_2 \rangle \subsetneq \langle x_1, x_2, x_3 \rangle \subsetneq \cdots$. But $F[x_1, x_2, \ldots, x_n]$ (in a finite number of indeterminates $x_1, x_2, \ldots, x_n$) is Noetherian by Corollary 3 of Theorem 7.1.2.

We now study the Hilbert Basis Theorem, which provides an extensive class of Noetherian rings. David Hilbert (1862-1943) asserted the theorem in 1890. This theorem revolutionized algebraic geometry. The word 'basis' is used in the sense of finite generation.

Theorem 7.1.2 (Hilbert Basis Theorem) If R is a Noetherian ring with identity, then R[x] is also a Noetherian ring.

Proof Let I be an arbitrary ideal of R[x]. To prove this theorem it is sufficient to show that I is finitely generated. For each integer $t \ge 0$, define the set $I_t = \{r \in R : a_0 + a_1 x + \cdots + r x^t \in I\} \cup \{0\}$ (i.e., consisting of zero and those $r \in R$ which appear as the leading non-zero coefficient of some polynomial in I of degree t). Then $I_t$ is an ideal of R such that $I_t \subseteq I_{t+1}$ $\forall t \ge 0$ $\Rightarrow$ $I_0 \subseteq I_1 \subseteq I_2 \subseteq \cdots$. Since R is a Noetherian ring, $\exists$ an integer n such that $I_k = I_n$ $\forall k \ge n$. Also each ideal $I_i$ of R is finitely generated and suppose $I_i = \langle a_{i1}, a_{i2}, \ldots, a_{im_i} \rangle$ $\forall i = 0, 1, 2, \ldots, n$, where $a_{ij}$ is the leading coefficient of a polynomial $f_{ij} \in I$ of degree i.

We claim that I is generated by the $m_0 + \cdots + m_n$ polynomials $f_{01}, \ldots, f_{0m_0}, \ldots, f_{n1}, \ldots, f_{nm_n}$. Let $J = \langle f_{01}, \ldots, f_{0m_0}, \ldots, f_{n1}, \ldots, f_{nm_n} \rangle$. Now each $f_{ij} \in I$ (by our choice) implies

$$J \subseteq I. \tag{7.1}$$

Let $f\ (\ne 0) \in R[x]$ be such that $f \in I$ and of degree t (say): $f = b_0 + b_1 x + \cdots + b_{t-1} x^{t-1} + b x^t$. We now apply induction on t. For t = 0, $f = b_0 \in I_0 \subseteq J$. Next assume that any polynomial $\in I$ of degree less than or equal to t − 1 also belongs to J.

Case 1. $t > n$ $\Rightarrow$ the leading coefficient b (of f) $\in I_t = I_n$ (as $I_t = I_n$ $\forall t \ge n$) $\Rightarrow$ $b = a_{n1} c_1 + a_{n2} c_2 + \cdots + a_{nm_n} c_{m_n}$ for suitable $c_i \in R$ $\Rightarrow$ $g = f - (f_{n1} c_1 + f_{n2} c_2 + \cdots + f_{nm_n} c_{m_n}) x^{t-n} \in I$ is zero or has degree < t (since the coefficient of $x^t$ in g is $b - \sum_{i=1}^{m_n} a_{ni} c_i = 0$) $\Rightarrow$ $g \in J$ by the induction hypothesis $\Rightarrow$ $f \in J$.

Case 2. $t \le n$ $\Rightarrow$ $b \in I_t$ $\Rightarrow$ $b = a_{t1} d_1 + a_{t2} d_2 + \cdots + a_{tm_t} d_{m_t}$ for some $d_1, d_2, \ldots, d_{m_t} \in R$ $\Rightarrow$ $h = f - (f_{t1} d_1 + f_{t2} d_2 + \cdots + f_{tm_t} d_{m_t}) \in I$ is zero or has degree < t (since the coefficient of $x^t$ in h is $b - \sum_{i=1}^{m_t} a_{ti} d_i = 0$) $\Rightarrow$ $h \in J$ by the induction hypothesis $\Rightarrow$ $f \in J$.

Consequently, in either case $I \subseteq J$ and hence I = J by (7.1). Thus I is finitely generated and hence R[x] is Noetherian. □

By induction, Hilbert Basis Theorem can be extended to polynomials in several 
indeterminates: 

Corollary 1 If R is a Noetherian ring with identity, then the polynomial ring $R[x_1, x_2, \ldots, x_n]$ in a finite number of indeterminates $x_1, \ldots, x_n$ is also a Noetherian ring.

Corollary 2 $\mathbf{Z}[x_1, x_2, \ldots, x_n]$ is a Noetherian ring.

Corollary 3 $F[x_1, \ldots, x_n]$ is a Noetherian ring for every field F.

A Noetherian ring need not satisfy dcc on ideals. We now consider rings which satisfy dcc on ideals.

Definition 7.1.5 A ring R is said to be an Artinian ring iff it satisfies the descending 
chain condition for ideals of R. (This ring is named after Emil Artin (1898-1962).) 

Definition 7.1.6 A ring R is said to satisfy the minimal condition (for ideals) iff 
every non-empty set of ideals of R , partially ordered by inclusion, has a minimal 
element. 

Theorem 7.1.3 Let R be a ring. Then R is Artinian iff R satisfies the minimal con- 
dition (for ideals). 
Proof Let R be Artinian and $\mathcal{F}$ a non-empty set of ideals of R. Let $I_1 \in \mathcal{F}$. If $I_1$ is not a minimal element, we can find another ideal $I_2 \in \mathcal{F}$ such that $I_1 \supsetneq I_2$. If $\mathcal{F}$ has no minimal element, the repetition of this process indefinitely yields the infinite strictly descending chain $I_1 \supsetneq I_2 \supsetneq I_3 \supsetneq \cdots$ of ideals of R $\Rightarrow$ a contradiction to the fact that R is Artinian $\Rightarrow$ $\mathcal{F}$ has a minimal element, i.e., R satisfies the minimal condition. Conversely, suppose R satisfies the minimal condition. Let $I_1 \supseteq I_2 \supseteq I_3 \supseteq \cdots$ be a descending chain of ideals of R. Consider $\mathcal{F} = \{I_t : t = 1, 2, 3, \ldots\}$. Then $I_1 \in \mathcal{F}$ $\Rightarrow$ $\mathcal{F} \ne \emptyset$. Again by hypothesis, $\mathcal{F}$ has a minimal element $I_n$ for some positive integer n $\Rightarrow$ $I_m \subseteq I_n$ $\forall m \ge n$. Now $I_m \subsetneq I_n$ $\Rightarrow$ $I_m \notin \mathcal{F}$ (by the minimality of $I_n$) $\Rightarrow$ an impossibility $\Rightarrow$ $I_m = I_n$ $\forall m \ge n$ $\Rightarrow$ R is Artinian. □

Remark For non-commutative rings, the definitions of left (right) Artinian rings are 
obvious. 

Theorem 7.1.4 A homomorphic image of a Noetherian ( Artinian ) ring is also 
Noetherian (Artinian). 

Proof Let $f$ be a homomorphism of the Noetherian ring R onto the ring S. Consider
the ascending chain of ideals of S:

$J_1 \subseteq J_2 \subseteq \cdots \subseteq J_n \subseteq \cdots$ (7.2)

Suppose $I_r = f^{-1}(J_r)$, for $r = 1, 2, \ldots$. Then

$I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots$. (7.3)

Relation (7.3) is an ascending chain of ideals of R. Then by hypothesis,
$\exists$ some index $n$ such that $I_m = I_n$ $\forall m \geq n$. Since $f$ is onto, $J_r = f(I_r)$ for each $r$,
and this shows that $J_m = J_n$ $\forall m \geq n$. Hence
the chain (7.2) becomes stationary at some point. Thus S is Noetherian.

For the Artinian ring, the proof is similar. □ 

Corollary If I is an ideal of a Noetherian (Artinian) ring R, then the quotient ring
R/I is Noetherian (Artinian).

Proof The corollary follows from Theorem 7.1.4 by taking, in particular, $S = R/I$
and $f : R \to R/I$ the natural homomorphism. □

Theorem 7.1.5 Let I be an ideal of a ring R. If I and R/I are both Noetherian 
(Artinian) rings , then R is also Noetherian (Artinian). 

Proof Let $I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots$ be an ascending chain of ideals of R. Let
$p : R \to R/I$ be the natural homomorphism of R onto R/I. Then
$p(I_1) \subseteq p(I_2) \subseteq \cdots \subseteq p(I_n) \subseteq \cdots$ is an ascending chain of ideals in R/I. Since R/I is Noetherian,
there exists a positive integer $n$ such that $p(I_n) = p(I_{n+i})$ for all $i \geq 1$. Also,
$I_1 \cap I \subseteq I_2 \cap I \subseteq \cdots \subseteq I_n \cap I \subseteq \cdots$ is an ascending chain of ideals in I. Since I is
Noetherian, there exists a positive integer $m$ such that $I_m \cap I = I_{m+i} \cap I$ for all $i \geq 1$.
Let $r = \max(m, n)$. Then $p(I_r) = p(I_{r+i})$ and $I_r \cap I = I_{r+i} \cap I$ for all $i \geq 1$. Let
$a \in I_{r+i}$. Then there exists an element $x \in I_r$ such that $p(a) = p(x)$, i.e., $a + I = x + I$.
Hence $a - x \in I$ and also $a - x \in I_{r+i}$. This shows that $a - x \in I_{r+i} \cap I = I_r \cap I$.
Hence $a - x \in I_r$. Again as $x \in I_r$, then $a \in I_r$. Consequently, $I_r = I_{r+i}$ for all
$i \geq 1$. This proves that R is Noetherian. □

Definition 7.1.7 An Artinian domain R is an integral domain which is also an Ar- 
tinian ring. 

We now prove some interesting properties of Noetherian and Artinian rings. 

Theorem 7.1.6 (a) Any Noetherian domain is a factorization domain. 

(b) Any Artinian domain is a field. 

Proof (a) Let R be a Noetherian domain. Suppose R is not a factorization domain.
Then there exists at least one non-zero, non-invertible element $a$ in R such that $a$
is not a finite product of irreducible elements of R. Let X be the set of all such
elements $x \in R$. As $a \in X$, $X \neq \emptyset$. Consider the set $S = \{(x) : x \in X\}$. Then S is a
non-empty collection of ideals of R. As R is Noetherian, S has a maximal element,
say $(y)$. Clearly, $y \in X$ and $y$ is not irreducible. This implies that $y = cd$ for some
non-zero non-invertible elements $c$ and $d$ in R. Hence $(y) \subsetneq (c)$ and $(y) \subsetneq (d)$. Then
by maximality of $(y)$ in S, it follows that $(c)$ and $(d)$ are not elements of S. This
shows that $c$ and $d$ are finite products of irreducible elements of R. Hence $y = cd$
is also a finite product of irreducible elements of R, a contradiction as $y \in X$.

(b) Let R be an Artinian domain and $a\ (\neq 0) \in R$. Then for the descending chain
of ideals $(a) \supseteq (a^2) \supseteq (a^3) \supseteq \cdots$, there exists an index $n$ such that
$(a^n) = (a^{n+1}) = (a^{n+2}) = \cdots$. Hence $(a^n) = (a^{n+1}) \Rightarrow a^n = r a^{n+1}$ for some $r \in R$ $\Rightarrow 1 = ra$
(by the cancellation law in R as $a^n \neq 0$) $\Rightarrow$ $a$ is invertible in R $\Rightarrow$ R is a field. □
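
As an illustrative sketch (not part of the original text), the stabilization of the chain $(a) \supseteq (a^2) \supseteq \cdots$ used in part (b) can be observed in the finite, and hence Artinian, rings $\mathbb{Z}/n\mathbb{Z}$. The helper names `principal_ideal` and `power_chain` are hypothetical and introduced only for this example.

```python
def principal_ideal(a, n):
    """Return the ideal (a) of Z/nZ as a frozenset of residues."""
    return frozenset((a * r) % n for r in range(n))


def power_chain(a, n):
    """Return the distinct ideals (a), (a^2), ... of Z/nZ up to stabilization."""
    chain = [principal_ideal(a, n)]
    power = a % n
    while True:
        power = (power * a) % n
        nxt = principal_ideal(power, n)
        if nxt == chain[-1]:
            # (a^k) = (a^(k+1)) forces all later terms to be equal as well
            return chain
        chain.append(nxt)


# Z/12Z is Artinian but not a domain: (2) contains (4), and (8) = (4).
chain_12 = power_chain(2, 12)
sizes_12 = [len(ideal) for ideal in chain_12]

# Z/7Z is a domain (indeed a field): the chain is constant at the whole ring.
chain_7 = power_chain(3, 7)
```

In $\mathbb{Z}/12\mathbb{Z}$ the chain stabilizes at the proper ideal $(4)$, where the cancellation step of the proof is not available because the ring has zero divisors. In the domain $\mathbb{Z}/7\mathbb{Z}$ the chain is constant at the whole ring, in line with 3 being a unit.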

Corollary 1 Every principal ideal domain is a factorization domain. 

Proof It follows from Proposition 7.1.1 and Theorem 7.1.6(a). □

Corollary 2 An integral domain with only a finite number of ideals is a field. 

Proof It follows from Theorem 7.1.6(b). □ 

We recall that a ring R is Noetherian iff every ideal of R is finitely generated. To 
show that R is Noetherian it is sufficient to consider just the prime ideals of R (this 
result is due to I.S. Cohen (1917-1955)). 

Theorem 7.1.7 (Cohen) A ring R is Noetherian iff every prime ideal of R is finitely
generated.

Proof ($\Rightarrow$) follows from Theorem 7.1.1.

($\Leftarrow$) Next assume that every prime ideal of R is finitely generated, but R fails to be
Noetherian. This shows that the collection F of ideals of R which are not finitely
generated is non-empty.

Order F by inclusion. Now consider an arbitrary chain $\{B_i\}$ in F. Then
$B = \bigcup_i B_i$ is an ideal of R such that B cannot be finitely generated (otherwise its
finitely many generators would all lie in some $B_i$, forcing $B = B_i$). Hence $B \in F$ is an
upper bound of the chain in F. Then by Zorn's Lemma F has a maximal element I (say). So, by
hypothesis, I cannot be a prime ideal of R. Then $\exists\, a, b \in R$ such that $a \notin I$, $b \notin I$ but
$ab \in I$. Clearly, both the ideals $(I, b)$ and $I : (b)$ (see Ex. 26 of Exercise-I, Chap. 5)
properly contain I, since $b \notin I$ and $a \in I : (b)$. So maximality of I in F $\Rightarrow$ $(I, b) \notin F$ and
$I : (b) \notin F$ $\Rightarrow$ $(I, b) = (c_1, c_2, \ldots, c_n)$ and $I : (b) = (d_1, d_2, \ldots, d_m)$ for suitable
$c_i, d_j \in R$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m)$ $\Rightarrow$ $c_i = a_i + b r_i$, for some $a_i \in I$ and
$r_i \in R$ $\Rightarrow$ $(I, b) = (a_1, a_2, \ldots, a_n, b)$.

Now consider the ideal $J = (a_1, a_2, \ldots, a_n, bd_1, \ldots, bd_m)$. Now $bd_j \in I$ $\forall j$ $\Rightarrow$
$J \subseteq I$. To show the reverse inclusion, let $x \in I$.

Then

$x \in (I, b) \Rightarrow x = \sum_{i=1}^{n} a_i x_i + by$ $(x_i, y \in R)$ $\Rightarrow by \in I$ (as each $a_i \in I$)

$\Rightarrow y \in I : (b) \Rightarrow y = \sum_{j=1}^{m} d_j t_j$ $(t_j \in R)$

$\Rightarrow x = \sum_{i=1}^{n} a_i x_i + \sum_{j=1}^{m} (bd_j) t_j \in J \Rightarrow I \subseteq J$.

Consequently, $I = J$ $\Rightarrow$ I is itself finitely generated $\Rightarrow$ an impossibility as $I \in F$.
This contradiction proves the theorem. □

The finiteness conditions satisfied by Noetherian (Artinian) rings give them an
advantage over arbitrary rings, which makes their study more attractive
and interesting. We prove the following theorems for the commutative case; the
non-commutative cases are left as exercises.

Theorem 7.1.8 Let R be a commutative Artinian ring with 1. Then every prime 
ideal of R is a maximal ideal. 

Proof Let A be a prime ideal of R. Then the quotient ring R/A is an integral
domain. Again, as R is an Artinian ring, R/A is also an Artinian ring by the
Corollary to Theorem 7.1.4. Hence by Theorem 7.1.6(b), R/A is a field.
Consequently, A is a maximal ideal of R. □

We can determine more ideal structure of an Artinian ring R . 

Theorem 7.1.9 Every Artinian commutative ring R with 1 has only a finite number
of prime ideals, each of which is maximal.

Proof If possible, suppose there exists an infinite sequence $\{A_i\}$ of distinct prime ideals of R.
Then we can form a descending chain of ideals:

$A_1 \supseteq A_1 A_2 \supseteq A_1 A_2 A_3 \supseteq \cdots$.

Since R is Artinian, $\exists$ a positive integer $n$ for which

$A_1 A_2 \cdots A_n = A_1 A_2 \cdots A_n A_{n+1}$.

This shows that $A_1 A_2 \cdots A_n \subseteq A_{n+1}$. As $A_{n+1}$ is prime, $A_t \subseteq A_{n+1}$ for some $t \leq n$. But $A_t$ is a
maximal ideal of R by Theorem 7.1.8. So we have $A_t = A_{n+1}$, which contradicts
the fact that the $A_i$ are distinct.

Consequently, R has only a finite number of prime ideals and each of them is
maximal by Theorem 7.1.8. □
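
The statement can be checked by brute force in a small example. The sketch below is only an illustration under the assumption that we work in the finite (hence Artinian) ring $\mathbb{Z}/60\mathbb{Z}$, whose ideals are exactly $d\mathbb{Z}/60\mathbb{Z}$ for the divisors $d$ of 60. The function names are hypothetical.

```python
def ideals(n):
    """All ideals of Z/nZ, keyed by generator d, one for each divisor d of n."""
    return {d: frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0}


def is_prime_ideal(ideal, n):
    """Proper ideal P such that ab in P implies a in P or b in P."""
    if len(ideal) == n:
        return False
    return all(a in ideal or b in ideal
               for a in range(n) for b in range(n)
               if (a * b) % n in ideal)


def is_maximal_ideal(ideal, n, all_ideals):
    """Proper ideal with no proper ideal strictly containing it."""
    if len(ideal) == n:
        return False
    return not any(ideal < other and len(other) < n
                   for other in all_ideals.values())


n = 60
all_ideals = ideals(n)
primes = sorted(d for d, I in all_ideals.items() if is_prime_ideal(I, n))
maximals = sorted(d for d, I in all_ideals.items()
                  if is_maximal_ideal(I, n, all_ideals))
```

Here both `primes` and `maximals` list the generators 2, 3 and 5, so the prime ideals of $\mathbb{Z}/60\mathbb{Z}$ are finite in number and coincide with the maximal ideals.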

We recall that the Jacobson radical of a ring R denoted by J(R) (or rad R) is the 
intersection of all maximal ideals of R. We now study the Jacobson radical of an 
Artinian ring. 

Theorem 7.1.10 The Jacobson radical of a commutative Artinian ring is the inter- 
section of finitely many maximal ideals of the ring. 

Proof Let R be an Artinian ring and S be the set of all maximal ideals of R. Let
$J = J(R)$ be the Jacobson radical of R. Then J is the intersection of all maximal ideals
of R. Let F be the set of all ideals of R each of which is an intersection
of finitely many maximal ideals of R. Then F is non-empty, since $S \subseteq F$. As R is
Artinian, F has a minimal element $M_0$ (say). Suppose $M_0 = M_1 \cap M_2 \cap \cdots \cap M_n$,
where the $M_i$ are in S. Then $J \subseteq M_0$. Again if M is in S, then $M_0 \cap M$ is in F. Hence
by minimality of $M_0$, it follows that $M_0 \cap M = M_0$. Thus $M_0 \subseteq M$ for all M in S.
Hence $J \subseteq M_0 \subseteq \bigcap_{M \in S} M = J$ shows that $J = M_0$. □

Theorem 7.1.11 The Jacobson radical of a commutative Artinian ring with 1 is 
nilpotent. 

Proof Let R be a commutative Artinian ring with 1 and $J = J(R)$ be its Jacobson
radical. We claim that $J^n = \{0\}$ for some positive integer $n$. Consider the descending
chain of ideals of R: $J \supseteq J^2 \supseteq J^3 \supseteq \cdots \supseteq J^n \supseteq \cdots$. As R is Artinian, there exists some
positive integer $n$ such that $J^n = J^m$ for all $m \geq n$. Suppose $I = J^n$. Hence $I = I^2$
and $IJ = I$. If possible, suppose $I \neq \{0\}$. Let F be the set of all ideals A of R
such that $A \subseteq I$ and $IA \neq \{0\}$. Then $\{0\} \neq I = I^2$ shows that $I \in F$. Hence F is
non-empty. Since R is Artinian and $\{0\}$ is not in F, it follows that F has a minimal
element, say $M \neq \{0\}$, such that $IM \neq \{0\}$ and $M \subseteq I$. Consequently, there exists
a non-zero element $y \in M$ such that $Iy \neq \{0\}$. Then $Iy$ is a non-zero ideal of R.
Moreover, $I(Iy) = I^2 y = Iy \neq \{0\}$ and $Iy \subseteq M \subseteq I$ show that $Iy \in F$. Then by
minimality of M, it follows that $Iy = M$. Hence there exists an element $x \in I$ such
that $y = xy$. Then $y(1 - x) = 0$. Now $x \in I = J^n \subseteq J$ implies that $1 - rx$ is a unit
for all $r \in R$ (see Ex. 23 of Exercises-I of Chap. 5). In particular, $1 - x$ is a unit in R.
Hence $y(1 - x) = 0$ implies $y = 0$. This is a contradiction. Hence $I = J^n = \{0\}$. □
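
A small computational sketch may help to see the nilpotency concretely. It assumes the ring $\mathbb{Z}/n\mathbb{Z}$ with $n \geq 2$, whose maximal ideals are $p\mathbb{Z}/n\mathbb{Z}$ for the primes $p$ dividing $n$; this example and all function names in it are illustrative additions, not taken from the text.

```python
from itertools import product


def maximal_ideals(n):
    """Maximal ideals of Z/nZ: pZ/nZ for each prime divisor p of n."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % q for q in range(2, p))]
    return [frozenset(range(0, n, p)) for p in primes]


def jacobson_radical(n):
    """Intersection of all maximal ideals of Z/nZ."""
    J = frozenset(range(n))
    for M in maximal_ideals(n):
        J &= M
    return J


def ideal_product(I, J, n):
    """Ideal IJ: the additive subgroup generated by all products ab."""
    gens = {(a * b) % n for a, b in product(I, J)}
    span, frontier = {0}, {0}
    while frontier:
        new = {(s + g) % n for s in frontier for g in gens} - span
        span |= new
        frontier = new
    return frozenset(span)


def nilpotency_index(n):
    """Smallest k with J^k = {0}, where J is the Jacobson radical of Z/nZ."""
    J = jacobson_radical(n)
    power, k = J, 1
    while power != frozenset({0}):
        power = ideal_product(power, J, n)
        k += 1
    return k


J72 = jacobson_radical(72)
index72 = nilpotency_index(72)
```

For $n = 72 = 2^3 \cdot 3^2$ the radical is $J = (6)$, with $J^2 = (36)$ and $J^3 = \{0\}$, so the computed index is 3.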

We recall that if every element of an ideal I of a ring is nilpotent, then the ideal I 
is called a nil ideal. 

Corollary Every nil ideal of a commutative Artinian ring R with 1 is nilpotent. 

Proof Let A be a nil ideal of R. Then every element of A is nilpotent and hence
$\forall a \in A$, $r \in R$, $ra$ is nilpotent. This implies that $1 + ra$ is a unit in R $\forall r \in R$. This
shows that $a \in \operatorname{rad} R$ (by Ex. 23, Exercises-I of Chap. 5). Hence $A \subseteq \operatorname{rad} R = J(R)$.

Since $J(R)$ is nilpotent by Theorem 7.1.11 and A is contained in $J(R)$, A is also
nilpotent. □

The following theorem is the same as Theorem 7.1.9. We now give an alternative 
proof. 

Theorem 7.1.12 Every commutative Artinian ring with 1 contains only finitely 
many maximal ideals. 

Proof Let R be a commutative Artinian ring with 1. Then its Jacobson radical $J = J(R)$
is an intersection of finitely many maximal ideals of R by Theorem 7.1.10. Let
$J = M_1 \cap M_2 \cap \cdots \cap M_n \supseteq M_1 M_2 \cdots M_n$. Again by Theorem 7.1.11, J is nilpotent. Then
$J^t = \{0\}$ for some positive integer $t$. Consequently, $\{0\} = J^t \supseteq (M_1 M_2 \cdots M_n)^t =
M_1^t M_2^t \cdots M_n^t$. Let M be a maximal ideal of R. Then $M \supseteq \{0\} = M_1^t M_2^t \cdots M_n^t$
implies that $M \supseteq M_i$ for some $i$ such that $1 \leq i \leq n$, i.e., $M_i \subseteq M$,
since M is a maximal ideal and hence it is prime. Again since both M and $M_i$ are
maximal ideals of R, it follows that $M = M_i$. We conclude that the only maximal
ideals of R are $M_1, M_2, \ldots, M_n$. □

Remark (i) Ex. 4 of Exercises-I shows that there are rings which are right Noethe- 
rian but not left Noetherian; 

(ii) Ex. 5 of Exercises-I shows that there are rings which are right Artinian but 
not left Artinian; 

(iii) Ex. 7 of Exercises-I shows that a subring of a Noetherian (or Artinian) ring 
may not be Noetherian (or Artinian); 

(iv) Z is Noetherian but not Artinian. 

So the concepts of left Noetherian and right Noetherian rings are different but 
these two concepts coincide if the ring is commutative. Again Z is Noetherian but 
not Artinian. Hence these two classes of rings are different in general. 

We now search for suitable situations under which a Noetherian ring is Artinian 
and conversely. 

Theorem 7.1.13 Let R be a commutative local ring with 1 such that its maximal
ideal is nilpotent. Then R is Artinian if and only if R is Noetherian.

Proof Let M be the maximal ideal of R. Then, by hypothesis, $M^t = \{0\}$ for some
positive integer $t$. Hence $F = R/M$ is a field (by Theorem 5.3.2). Clearly, R is
Artinian (or Noetherian) iff M is Artinian (or Noetherian). Again M is Artinian
(Noetherian) $\Leftrightarrow$ $M/M^2$ and $M^2$ are Artinian (or Noetherian) $\Leftrightarrow \cdots \Leftrightarrow$ $M^i/M^{i+1}$
and $M^{i+1}$ are Artinian (or Noetherian). But $M^i/M^{i+1}$ is Artinian (or Noetherian)
$\Leftrightarrow$ $M^i/M^{i+1}$ is a finite dimensional vector space over F for all $i = 1, 2, \ldots, t - 1$
(see Chaps. 8 and 9). If R is Artinian (or Noetherian), then each $M^i/M^{i+1}$ is
Artinian (or Noetherian). Hence each is a finite dimensional vector space over the field
F, and therefore each is both Noetherian and Artinian. Thus $M^{t-1} = M^{t-1}/M^t$ and $M^{t-2}/M^{t-1}$
are both Noetherian (or Artinian), which shows that $M^{t-2}$ is Noetherian (or Artinian).
Proceeding in this way we prove that M is Noetherian (or Artinian). This proves the
theorem. □

Theorem 7.1.14 Every ideal in a commutative Noetherian ring with 1 contains a 
product of prime ideals. 

Proof Let R be a commutative Noetherian ring with 1 and T be the set of all ideals
A of R such that A does not contain any product of prime ideals. If T is not empty,
then T has a maximal element, say M. This M cannot be prime. Hence there exist
elements $x, y \in R$ such that $xy \in M$ but $x, y \notin M$. Consider $A = M + Rx$ and
$B = M + Ry$. Then $M \subseteq A \cap B$ and $M \neq A$ and $M \neq B$. Hence by maximality of
M in T, $A, B \notin T$. This shows that both A and B contain some product of prime
ideals of R. Again $AB = (M + Rx)(M + Ry) \subseteq M + Rxy = M$, as $xy \in M$. This
implies that AB contains a product of prime ideals of R and hence M contains
a product of prime ideals of R. This is a contradiction. Consequently, T must be
empty. This proves the theorem. □

Corollary Let R be a commutative Noetherian ring with 1. Then $\{0\} = P_1^{t_1} P_2^{t_2}
\cdots P_n^{t_n}$, where the $P_i$'s are distinct prime ideals of R and $t_1, t_2, \ldots, t_n$ are some positive
integers.

Theorem 7.1.15 A commutative Artinian ring R with 1 is Noetherian. Conversely , 
if R is a Noetherian ring with 1 in which every prime ideal is maximal , then R is 
also Artinian. 

Proof Suppose R is a commutative ring with 1 which is Artinian. Then by
Theorem 7.1.12, R contains only finitely many maximal ideals $M_1, M_2, \ldots, M_n$ (say)
such that its Jacobson radical J satisfies the relation $\{0\} = J^t = M_1^t M_2^t \cdots M_n^t$,
where $t$ is the index of nilpotency of J. The maximal ideals $M_1, M_2, \ldots, M_n$ are pairwise
co-prime and hence by the Chinese Remainder Theorem 5.6.1 of Chap. 5 it follows that

$R \cong R/M_1^t \times R/M_2^t \times \cdots \times R/M_n^t$. (7.4)

Again each $R/M_i^t$ is an Artinian local ring whose maximal ideal $M_i/M_i^t$ is
nilpotent. Hence each $R/M_i^t$ is Noetherian by Theorem 7.1.13. Consequently, R is
Noetherian by (7.4).

Next suppose R is a Noetherian ring with 1 in which every prime ideal is
maximal. Then each maximal ideal of R is also a prime ideal. Hence it follows by the
Corollary to Theorem 7.1.14 that $\{0\} = M_1^{t_1} M_2^{t_2} \cdots M_n^{t_n}$ for some maximal ideals $M_i$
and some positive integers $t_i$. Consequently, $R \cong R/M_1^{t_1} \times R/M_2^{t_2} \times \cdots \times R/M_n^{t_n}$.
Proceeding as before, the converse part follows. □
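
The decomposition (7.4) can be illustrated numerically. The following sketch assumes the example ring $\mathbb{Z}/72\mathbb{Z}$, where $M_1 = (2)$, $M_2 = (3)$ and $t = 3$, so that $R/M_1^3 \cong \mathbb{Z}/8\mathbb{Z}$ and $R/M_2^3 \cong \mathbb{Z}/9\mathbb{Z}$. The names `crt_map`, `add` and `mul` are hypothetical helpers for this illustration.

```python
def crt_map(a, moduli):
    """Send a residue a mod n to its tuple of residues modulo each factor."""
    return tuple(a % m for m in moduli)


def add(u, v, moduli):
    """Componentwise addition in the product ring."""
    return tuple((x + y) % m for x, y, m in zip(u, v, moduli))


def mul(u, v, moduli):
    """Componentwise multiplication in the product ring."""
    return tuple((x * y) % m for x, y, m in zip(u, v, moduli))


n, moduli = 72, (8, 9)

# Injective on 72 elements into a set of 8 * 9 = 72 tuples, hence bijective.
images = {crt_map(a, moduli) for a in range(n)}
is_bijection = len(images) == n

# The map respects both ring operations.
is_ring_map = all(
    crt_map((a + b) % n, moduli) == add(crt_map(a, moduli), crt_map(b, moduli), moduli)
    and crt_map((a * b) % n, moduli) == mul(crt_map(a, moduli), crt_map(b, moduli), moduli)
    for a in range(n) for b in range(n)
)
```

Both flags evaluate to true, which is the isomorphism $\mathbb{Z}/72\mathbb{Z} \cong \mathbb{Z}/8\mathbb{Z} \times \mathbb{Z}/9\mathbb{Z}$ in this special case.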

Problem Let R be a Noetherian ring and $f : R \to R$ an epimorphism. Then $f$ is
an isomorphism.

[Hint. For each $n \in \mathbb{N}^+$, $f^n$ is an epimorphism and $\ker f \subseteq \ker f^2 \subseteq \ker f^3 \subseteq \cdots$
is an ascending chain of ideals in R. Now R is Noetherian $\Rightarrow$ $\exists\, m \in \mathbb{N}^+$ such that
$\ker f^m = \ker f^{m+t}$ $\forall t \in \mathbb{N}^+$ $\Rightarrow$ $f$ is a monomorphism.]


7.2 An Application of Hilbert Basis Theorem to Algebraic 
Geometry 

The Hilbert Basis Theorem plays a very important role in algebraic geometry. We 
present in this section an application of this theorem to algebraic geometry. 

We follow the notation of Sect. 5.5 of Chap. 5. 

Theorem 7.2.1 If $K^n$ is an affine n-space, then every algebraic set in $K^n$ is
determined by a finite set of polynomials in $K[x_1, x_2, \ldots, x_n]$.

Proof Let $A \subseteq K^n$ be an affine algebraic set. Then $\exists$ a subset S of $K[x_1, x_2, \ldots, x_n]$
such that $A = V(S)$. If P is the ideal generated by S, then $V(P) = V(S)$. Now
P being an ideal of the Noetherian ring $K[x_1, x_2, \ldots, x_n]$ (as every field K is a
Noetherian ring), P is generated by a finite set of polynomials in $K[x_1, x_2, \ldots, x_n]$.
Then $A = V(S) = V(P)$ is determined by a finite set of polynomials in
$K[x_1, x_2, \ldots, x_n]$. □

Remark Every ascending sequence of ideals $P_1 \subseteq P_2 \subseteq \cdots$ in $K[x_1, x_2, \ldots, x_n]$
is stationary, i.e., $\exists\, n$ such that $P_n = P_{n+1} = \cdots$. This implies, in view of
Proposition 5.5.1(i) of Chap. 5, the following descending chain condition for algebraic
sets.

Every decreasing sequence of algebraic sets $V(P_1) \supseteq V(P_2) \supseteq \cdots$ in $K^n$
must terminate, i.e., $\exists\, n$ such that $V(P_n) = V(P_{n+1}) = \cdots$.

Then it follows that every non-empty collection U of algebraic sets in $K^n$ has
a member M such that no member of U is properly contained in M; otherwise we
can construct inductively a non-terminating decreasing sequence of algebraic sets
in $K^n$. Such a member M of U is called a minimal member.

Theorem 7.2.2 Any algebraic set A can be expressed uniquely as a finite union of
affine varieties: $A = A_1 \cup A_2 \cup \cdots \cup A_r$, so that there is no inclusion relation among
the $A_i$'s, i.e., $A_i \not\subseteq A_j$ $\forall i \neq j$.

Proof (Existence) Let U be the collection of all those algebraic sets in $K^n$ which
cannot be expressed as a finite union of affine varieties. We claim that $U = \emptyset$. If
$U \neq \emptyset$, U has a minimal member M by the descending chain condition for algebraic
sets. This M cannot be an affine variety, otherwise M itself would be a finite union of
irreducible algebraic sets (with a single term) and hence M would not be a member of U.
Then $\exists$ two algebraic sets $A_1 \neq M$ and $A_2 \neq M$ such that $M = A_1 \cup A_2$. Since M
is a minimal member of U, $A_1 \notin U$ and $A_2 \notin U$. Consequently, each of $A_1$ and $A_2$
can be expressed as a finite union of affine varieties, so M is a finite union of affine
varieties. This means $M \notin U$, a contradiction. Therefore U must be empty.
In other words, any algebraic set A can be expressed as a finite union of affine
varieties: $A = A_1 \cup A_2 \cup \cdots \cup A_r$. Moreover, if there is an inclusion relation among
the $A_i$'s, we can omit the superfluous terms so that $A_i \not\subseteq A_j$ $\forall i \neq j$.

(Uniqueness) Let $A = \bigcup_{i=1}^{r} A_i = \bigcup_{j=1}^{s} B_j$, where each $A_i$, $B_j$ is irreducible and
$A_i \subseteq A_k \Rightarrow i = k$, $B_j \subseteq B_k \Rightarrow j = k$. Each $B_j$ can be expressed as
$B_j = B_j \cap A = B_j \cap (A_1 \cup A_2 \cup \cdots \cup A_r) = (B_j \cap A_1) \cup (B_j \cap A_2) \cup \cdots \cup (B_j \cap A_r)$.
Since each $B_j \cap A_i$ is an algebraic set and $B_j$ is irreducible, we must have $B_j = B_j \cap A_m \subseteq A_m$
for some $m$, $1 \leq m \leq r$. By similar arguments, $A_m \subseteq B_k$ for some $k$, $1 \leq k \leq s$.
Therefore $B_j \subseteq B_k$, which implies $j = k$. Consequently, each $B_j = A_m$ for some $m$,
$1 \leq m \leq r$.

Similarly, each $A_m = B_j$ for some $j$, $1 \leq j \leq s$. This proves that the
representation is unique.

The representation $A = A_1 \cup A_2 \cup \cdots \cup A_r$ is called a decomposition of A, and
the $A_i$'s are called the irreducible components of A. □
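
As a worked illustration of such a decomposition (an added example, assuming K is an infinite field and using the notation $V(\cdot)$ of Sect. 5.5), consider the algebraic set in $K^3$ defined by the polynomials $xz$ and $yz$. A point $(a, b, c)$ satisfies $ac = 0$ and $bc = 0$ iff $c = 0$ or $a = b = 0$, which gives:

```latex
\[
  A \;=\; V(xz,\; yz) \;=\; V(z) \,\cup\, V(x, y) \;\subset\; K^{3},
  \qquad
  V(z) \not\subseteq V(x, y), \quad V(x, y) \not\subseteq V(z).
\]
```

Here $V(z)$ is the $xy$-plane and $V(x, y)$ is the $z$-axis. Under the stated assumption on K both are affine varieties, neither contains the other, and so they are the two irreducible components of A.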


7.3 An Application of Cohen’s Theorem 

In this section we show that Cohen’s Theorem (proved in the earlier section) offers 
a sufficient condition for a ring to be Noetherian. 

Theorem 7.3.1 If R is a ring in which every maximal ideal is generated by an 
idempotent element , then R is Noetherian. 

Proof Let I be a primary ideal of R (see Ex. 20 of Exercise-I, Chap. 5). We claim
that I is a maximal ideal. Otherwise, there exists a maximal ideal M such that
$I \subsetneq M$. Then, by hypothesis, $M = (e)$ where $e$ is an idempotent element in R
such that $e \neq 0$ and $e \neq 1$, since $e = 0 \Rightarrow$ R is a field and the proof is trivial. Then
$e(1 - e) = 0 \in I$ and I is a primary ideal with $e \notin I$ (otherwise $M = (e) \subseteq I$)
$\Rightarrow (1 - e)^n \in I \subsetneq M$ for some positive
integer $n$ $\Rightarrow 1 - e \in M = (e) \Rightarrow 1 \in M \Rightarrow$ a contradiction $\Rightarrow$ I is a maximal ideal.
Since every primary ideal of R is maximal, the concepts of maximal, prime and
primary ideals coincide. Hence by our hypothesis, every maximal (hence, every prime)
ideal is finitely generated $\Rightarrow$ R is necessarily Noetherian by Cohen's Theorem. □


7.4 Supplementary Examples (SE-I)

1 Every Euclidean domain is a Noetherian domain.

[Hint. D is a Euclidean domain $\Rightarrow$ D is a PID by Theorem 6.2.1 $\Rightarrow$ D is a
Noetherian domain by Proposition 7.1.1.]

2 Let D be an integral domain which is not a field but D is Noetherian. Then D
contains elements which are irreducible.

[Hint. D is not a field $\Rightarrow$ $\exists$ a non-zero non-unit element $a \in D$. If $a$ is irreducible,
there is nothing to prove. Otherwise, if $a$ is reducible in D, then $\exists$ a non-zero non-unit
element $a_1 \in D$ such that $a_1 \mid a$ and $a_1$ is not an associate of $a$. Hence $(a) \subsetneq (a_1)$.
If $a_1$ is not irreducible, repeat the process. If no irreducible element were ever
reached, we would obtain a strictly ascending chain of principal ideals:

$(a) \subsetneq (a_1) \subsetneq (a_2) \subsetneq \cdots$

which contradicts the fact that D is Noetherian.]

3 Every Noetherian domain is a factorization domain.

[Hint. Let D be a Noetherian domain which is not a factorization domain. Then
$\exists$ at least one non-zero, non-unit element $a$ (say) in D such that $a$ is not a product
of irreducible elements. Let A be the set of all such elements $a$ of D. Hence $A \neq \emptyset$.
Consider the set $S = \{(a) : a \in A\}$. Then $S \neq \emptyset$. Now D is a Noetherian domain $\Rightarrow$ S
has a maximal element $(b)$ (say) $\Rightarrow$ $b$ is not irreducible in D $\Rightarrow$ $\exists$ non-zero non-unit
elements $c, d \in D$ such that $b = cd$ and $c$, $d$ are not associates of $b$ $\Rightarrow$ $(b) = (cd) \subsetneq (c)$
and $(b) \subsetneq (d)$ $\Rightarrow$ $(c) \notin S$ and $(d) \notin S$ (by maximality of $(b)$) $\Rightarrow$ $c$ and $d$ are both
products of irreducible elements of D $\Rightarrow$ a contradiction.]

4 Let R be a commutative Artinian ring with 1 such that $|R| > 1$ and R does not
contain zero divisors. Then R is a field.

[Hint. Let $a\ (\neq 0) \in R$. Consider the chain $(a) \supseteq (a^2) \supseteq (a^3) \supseteq \cdots$.

R is Artinian $\Rightarrow$ $\exists$ a positive integer $m$ such that $(a^m) = (a^{m+1})$. Then $a^m\ (\neq 0) \in
(a^{m+1})$ $\Rightarrow$ $\exists$ some $r \in R$ such that $a^m = r a^{m+1}$ $\Rightarrow$ $ra = 1$.]


7.5 Exercises 

Exercises-I 

1. Show that the ring of integers $(\mathbb{Z}, +, \cdot)$ is Noetherian but not Artinian.

[Hint. For any positive integer $n$, the strictly descending chain $(n) \supsetneq (2n) \supsetneq
(4n) \supsetneq \cdots$ of ideals of $\mathbb{Z}$ does not terminate.]

2. Let $p$ be a fixed prime integer. Consider the group $\mathbb{Z}(p^\infty) = \{m/p^n : 0 \leq m <
p^n;\ m \in \mathbb{Z},\ n = 0, 1, 2, \ldots\}$ under the operation of addition modulo 1. Then
$\mathbb{Z}(p^\infty)$ is a ring (without identity) by defining the product $ab$ to be zero $\forall a, b \in
\mathbb{Z}(p^\infty)$. Show that $(\mathbb{Z}(p^\infty), +, \cdot)$ is an Artinian ring but not Noetherian.

3. Let F be the ring of all real valued functions on $\mathbb{R}$. For any positive real number
$r$ define $I_r = \{f \in F : f(x) = 0 \text{ for } -r \leq x \leq r\}$. Then show that $I_r$ is an ideal
of F such that

$\cdots \subset I_3 \subset I_2 \subset I_1 \subset I_{1/2} \subset I_{1/3} \subset \cdots$.

Also, show that F is neither Noetherian nor Artinian.

4. Consider the ring $R = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} : a \in \mathbb{Z};\ b, c \in \mathbb{Q} \right\}$ under usual addition and
multiplication. Show that R is right Noetherian but not left Noetherian.

[Hint. For any non-negative integer $n$, the set $I_n = \left\{ \begin{pmatrix} 0 & m/2^n \\ 0 & 0 \end{pmatrix} : m \in \mathbb{Z} \right\}$ is a
left ideal of R such that $I_1 \subsetneq I_2 \subsetneq \cdots$.

To show that R is right Noetherian, prove that every non-zero right ideal of
R is finitely generated.]

5. Consider the ring $R = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} : a \in \mathbb{Q};\ b, c \in \mathbb{R} \right\}$ under usual addition and
multiplication. Show that R is right Artinian but not left Artinian.

6. (a) Let R be a commutative Noetherian ring with 1. Show that every ideal of R 

contains a finite product of prime ideals. 

(b) Show that a commutative Artinian ring R with 1 is a field $\Leftrightarrow$ R is an integral
domain.

7. Show by examples that a subring of a Noetherian (Artinian) ring may not be 
Noetherian (Artinian). 

[Hint. $\mathbb{Q}[x]$ is Noetherian by Theorem 7.1.2 but its subring $R = \{f \in \mathbb{Q}[x] :$
the constant term of $f$ belongs to $\mathbb{Z}\}$ is not Noetherian, since the strictly ascending
chain $(x) \subsetneq (x/2) \subsetneq (x/2^2) \subsetneq \cdots$ of ideals of R does not stabilize at any point.
Again $\mathbb{Z}$ is a subring of the field $\mathbb{Q}$, but $\mathbb{Z}$ is not Artinian by Exercise 1.]

8. Is the statement of the Hilbert Basis Theorem true on replacement of Noetherian 
rings by Artinian rings? Justify your answer. 

[Hint. For a field F, the strictly descending chain of principal ideals of $F[x]$:
$(x) \supsetneq (x^2) \supsetneq (x^3) \supsetneq \cdots$ will never terminate.]

9. An ideal I of a ring R is said to be a nil ideal iff each element x in I is nilpotent; 
I is said to be nilpotent iff I n = {0} for some positive integer n. Let R be an 
Artinian ring. Examine the validity of the following statements: 

(i) rad R is a nilpotent ideal of R ; 

(ii) every nil ideal of R is nilpotent. 

10. Let R be a left Artinian ring with identity 1 and G the group of units of R. An
element $a \in R$ is said to be left quasi-regular iff $\exists\, r \in R$ such that $r + a + ra = 0$.
In this case, the element $r$ is called a left quasi-inverse of $a$. Let J denote the
Jacobson radical of R.

Prove the following:

(a) Let $G^*$ be the group of units of R/J. Then $g \in G \Leftrightarrow g + J \in G^*$;

(b) G is a finite group $\Rightarrow$ R is finite;

(c) G is an abelian group and $a$, $b$ are quasi-regular elements of R $\Rightarrow$ $ab = ba$
(in particular, J is commutative).

11. Examine the validity of the statement that every Artinian ring is Noetherian but
its converse is not true.

12. Examine the validity of the statement that a ring R is Artinian iff R is
Noetherian and every prime ideal of R is maximal.

13. For any commutative ring R, the Krull dimension of R is the maximum possible
length of a chain $I_1 \subset I_2 \subset \cdots \subset I_n$ (or $I_1 \supset I_2 \supset \cdots \supset I_n$) of distinct prime
ideals of R. For example, a field has dimension zero and a PID (which is not a
field) has dimension 1. The Krull dimension of R is said to be infinite iff R has
arbitrarily long chains of distinct prime ideals.

Show that a ring R is Artinian iff R is Noetherian and it has Krull
dimension zero.

14. Show that any ring which is a quotient of a polynomial ring over the integers or 
over a field is Noetherian. 

[Hint. Use Corollary of Theorem 7.1.4 and Hilbert Basis Theorem.] 


7.6 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Artin 1991;
Atiyah and Macdonald 1969; Birkhoff and Mac Lane 1965; Burton 1968; Fraleigh
1982; Fulton 1969; Herstein 1964; Hungerford 1974; Jacobson 1974, 1980; Lang
1965; McCoy 1964; Musili 1992; van der Waerden 1970; Zariski and Samuel 1958,
1960) for further details.


Chapter 8

Vector Spaces 


In the earlier chapters we introduced two main algebraic systems, namely groups
and rings, which involve only internal operations. In this chapter, we introduce
another algebraic system which involves both internal and external operations
connected by some relations. A vector space combines an additive abelian
group and a field, interlinked by an external law of operation. In this chapter, we
study vector spaces and closely related fundamental concepts, such as linear
independence, basis, dimension, linear transformation and its matrix representation,
eigenvalue, inner product space, etc. Such concepts form an integral part of linear
algebra. Vector spaces have multifaceted applications. Vector spaces over finite fields
play an important role in computer science, coding theory, design of experiments and
combinatorics. Vector spaces over the infinite field Q of the rationals are important
in number theory and design of experiments, and vector spaces over C are essential
for the study of eigenvalues. In this chapter, we also show how to construct
isomorphic replicas of all homomorphic images of a specified abstract vector space. As
the concept of a vector provides a geometric motivation, vector spaces facilitate the
study of many areas of mathematics and integrate abstract algebraic concepts
with geometric ideas.


8.1 Introductory Concepts 

A vector space is just an algebraic system whose elements combine, under vector 
addition and multiplication by scalars from a suitable field satisfying certain condi- 
tions. 

The concept of vector spaces arose through the study of geometry and physics.
$\mathbb{R}^2$ endowed with the usual distance function is called the 'Euclidean plane'. It is also
known as the 'Cartesian plane' (or coordinate plane) in honor of René Descartes
(1596-1650). He first identified the ordered pairs of real numbers with the points in the
coordinate plane. We identify the point $X = (x_1, x_2) \in \mathbb{R}^2$ with the arrow $\overrightarrow{OX}$ starting
from the origin $O = (0, 0)$ and ending at the point X. The length of the line segment
$OX$ is denoted by $|OX|$. We define addition '+' and scalar multiplication '$\cdot$' on $\mathbb{R}^2$
as follows.

Addition on $\mathbb{R}^2$ is defined by the usual parallelogram law. If $\overrightarrow{OX}$ is the arrow
corresponding to the point $X = (x_1, x_2) \in \mathbb{R}^2$, then for every non-zero real number $r$,
the arrow $r\,\overrightarrow{OX}$ has length $|r|$ times the length $|OX|$, and its direction is the
same as or opposite to that of $\overrightarrow{OX}$ according as $r > 0$ or $r < 0$. If $r = 0$, then $r\,\overrightarrow{OX}$ is identified with
$(0, 0)$ and no direction is assigned. For simplicity, we use the symbol $r \cdot X$ (or $rX$)
for $r \cdot \overrightarrow{OX}$. Similarly, the ordered triples $(x_1, x_2, x_3) \in \mathbb{R}^3$ are identified with points
in the 3-dimensional Euclidean space $\mathbb{R}^3$.

In physics, there are quantities, such as forces acting at a point in a plane, ve- 
locities, and accelerations etc., called vectors. They are also added by parallelogram 
law and multiplied by real numbers (called scalars) in a similar way. 

Using this fact, the operations of pointwise addition and scalar multiplication are
defined on $\mathbb{R}^2$ as follows:

$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$;
$r \cdot (x_1, x_2) = (r x_1, r x_2)$

for all $(x_1, x_2), (y_1, y_2) \in \mathbb{R}^2$ and $r \in \mathbb{R}$. Then $(\mathbb{R}^2, +)$ is an abelian group.

We consider the above scalar multiplication as a mapping $\mu : \mathbb{R} \times \mathbb{R}^2 \to \mathbb{R}^2$,
$(r, X) \mapsto r \cdot X$, satisfying the following properties:

1. $1 \cdot X = X$;

2. $r \cdot (s \cdot X) = (rs) \cdot X$;

3. $(r + s) \cdot X = r \cdot X + s \cdot X$;

4. $r \cdot (X + Y) = r \cdot X + r \cdot Y$,

for all $X, Y \in \mathbb{R}^2$ and $r, s \in \mathbb{R}$.
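
The four properties can be spot-checked mechanically. The sketch below is an added illustration which assumes that exact rational numbers (`Fraction`) stand in for real scalars and coordinates; the sample lists and the function names are hypothetical, and a finite check of this kind is of course not a proof.

```python
from fractions import Fraction
from itertools import product


def add(X, Y):
    """Pointwise addition of two pairs."""
    return (X[0] + Y[0], X[1] + Y[1])


def scale(r, X):
    """Scalar multiplication r . X of a pair by a scalar."""
    return (r * X[0], r * X[1])


# Sample scalars and points (rationals standing in for reals).
scalars = [Fraction(-3, 2), Fraction(0), Fraction(1), Fraction(5, 7)]
points = [(Fraction(1), Fraction(2)),
          (Fraction(-1, 3), Fraction(4)),
          (Fraction(0), Fraction(0))]


def vector_properties_hold():
    """Check properties 1-4 on every combination of the samples."""
    for r, s in product(scalars, repeat=2):
        for X, Y in product(points, repeat=2):
            ok = (
                scale(1, X) == X                                             # property 1
                and scale(r, scale(s, X)) == scale(r * s, X)                 # property 2
                and scale(r + s, X) == add(scale(r, X), scale(s, X))         # property 3
                and scale(r, add(X, Y)) == add(scale(r, X), scale(r, Y))     # property 4
            )
            if not ok:
                return False
    return True
```

Exact rational arithmetic is used instead of floating point so that the equalities are tested without rounding error.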

These properties are called vector properties of $\mathbb{R}^2$. Similar properties hold in
the Euclidean n-space $\mathbb{R}^n$ and n-dimensional unitary space $\mathbb{C}^n$ for $n > 1$ (see
Sect. 8.10). Each point $x = (x_1, x_2, \ldots, x_n)$ of $\mathbb{R}^n$ (or $\mathbb{C}^n$) can be thought to
represent a vector.

The above vector properties set up an algebra called algebra of vectors. There are 
many mathematical systems having such vector properties. These systems introduce 
the concept of vector spaces over fields. Vector spaces generalize the algebra of 
vectors. 

The vector space structure is one of the most important algebraic structures. The 
concept of a vector is essential to the study of functions of several variables. Vector 
spaces play an important role in the solution of specified problems. We now define 
vector spaces over an arbitrary field with an eye to the model of vector spaces over 
the field R of real numbers. 

Definition 8.1.1 A vector space or a linear space over a field F is an additive
abelian group V together with an external law of composition (called scalar
multiplication)
$\mu : F \times V \to V$, the image of $(\alpha, v)$ under $\mu$ being denoted by $\alpha v$, satisfying the
following conditions:




V(1) $1v = v$, where 1 is the multiplicative identity in F;

V(2) $(\alpha\beta)v = \alpha(\beta v)$;

V(3) $(\alpha + \beta)v = \alpha v + \beta v$;

V(4) $\alpha(u + v) = \alpha u + \alpha v$, $\forall \alpha, \beta \in F$ and $u, v \in V$.

Unless otherwise stated, by a vector space V we mean a vector space over a
field F. In particular, if $F = \mathbb{R}$, V is called a real vector space and if $F = \mathbb{C}$, V is
called a complex vector space. A vector space over a division ring is defined in a
similar way. However, in this chapter, we consider vector spaces over a field. The 
algebraic properties of an arbitrary vector space are similar to the vector properties 
of R 2 , R 3 or R n . Consequently, an element of a vector space V is called a vector 
and an element of F is called a scalar. 

Example 8.1.1 (i) Let $V = M_{m,n}(F)$ be the set of all $m \times n$ matrices over a field F.
Then V forms a vector space over F under usual addition of matrices and
multiplication of a matrix by a scalar.

(ii) Let F be a field. Then $F^1 = F$ is itself a vector space over F under usual
addition and multiplication in the field F. If $n > 1$, then $F^n$ is also a vector space
over F under pointwise addition and scalar multiplication defined by

$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$ and
$\alpha(x_1, x_2, \ldots, x_n) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n)$, $\forall \alpha \in F$ and $(x_1, x_2, \ldots, x_n) \in F^n$.

In particular, $\mathbb{R}^n$ is a vector space over $\mathbb{R}$ and $\mathbb{C}^n$ is a vector space over $\mathbb{C}$.

Remark An element $(x_1, x_2, \ldots, x_n)$ of $\mathbf{R}^n$ (or $\mathbf{C}^n$) may be considered as the real
(or complex) $1 \times n$ matrix, called a row vector, or as an $n \times 1$ matrix

$\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$

called a column vector.

(iii) Let K be a subfield of a field F. Then F is a vector space over K , under 
usual addition and multiplication in F . 

(iv) Let $V = F[x]$ be the polynomial ring in $x$ over a field $F$. Then $F[x]$ is a
vector space over $F$. In particular, if $V_n[x]$ denotes the set of all polynomials in
$F[x]$ of degree less than a fixed integer $n$, along with the zero polynomial, then
$V_n[x]$ is also a vector space over $F$.

(v) (a) Let V = C ([a, b ]) denote the set of all real valued continuous functions 
on [a, b]. Then V is a vector space over R under the usual pointwise addition and 
scalar multiplication. In particular, R[x] is a vector space over R. 

(b) The set V = D([a, b]) of all real valued differentiable functions on [a, b] is a
vector space over R under the same compositions defined in (a).

(vi) Let V be the set of all sequences of real numbers (or complex numbers).
Then V is a vector space over R (or C).

(vii) Let V be the set of all solutions of a system of m linear homogeneous 
equations in n variables with real (or complex) coefficients. Then V is a real (or 
complex) vector space, called the solution space of the system. 

(viii) The power set $P(S)$ of a non-empty set $S$ is a vector space over $\mathbf{Z}_2$ under
the group addition:

$x + y = (x - y) \cup (y - x)$ (symmetric difference), where $x - y = x \setminus y$;

and scalar multiplication:

$\alpha x = x$, if $\alpha = (1) \in \mathbf{Z}_2$,
$\alpha x = \emptyset$, if $\alpha = (0) \in \mathbf{Z}_2$.
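Example (viii) is small enough to verify exhaustively. The sketch below assumes the hypothetical ground set $S = \{1, 2, 3\}$ and represents subsets as Python `frozenset` objects; the function names are illustrative only. It checks the abelian group laws for symmetric difference together with V(1) to V(4) over $\mathbf{Z}_2$.

```python
from itertools import combinations

S = frozenset({1, 2, 3})  # hypothetical ground set, chosen for illustration


def power_set(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def add(x, y):
    """Vector addition: the symmetric difference (x - y) U (y - x)."""
    return (x - y) | (y - x)


def scale(alpha, x):
    """Scalar multiplication by alpha in Z_2 = {0, 1}."""
    return x if alpha % 2 == 1 else frozenset()


def is_vector_space_over_z2():
    V = power_set(S)
    zero = frozenset()
    group = (all(add(x, zero) == x and add(x, x) == zero for x in V)
             and all(add(x, y) == add(y, x) for x in V for y in V)
             and all(add(add(x, y), z) == add(x, add(y, z))
                     for x in V for y in V for z in V))
    axioms = (all(scale(1, x) == x for x in V)
              and all(scale((a * b) % 2, x) == scale(a, scale(b, x))
                      for a in (0, 1) for b in (0, 1) for x in V)
              and all(scale((a + b) % 2, x) == add(scale(a, x), scale(b, x))
                      for a in (0, 1) for b in (0, 1) for x in V)
              and all(scale(a, add(x, y)) == add(scale(a, x), scale(a, y))
                      for a in (0, 1) for x in V for y in V))
    return group and axioms
```

Note that every subset is its own additive inverse, which is exactly what V(3) demands when $\alpha = \beta = 1$ in $\mathbf{Z}_2$.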

Remark The vector spaces $\mathbf{R}^n$, $\mathbf{C}^n$, $M_{m,n}(\mathbf{R})$ and $M_{m,n}(\mathbf{C})$ are representatives of
many vector spaces. They have some additional properties: for example, $\mathbf{R}^n$ and $\mathbf{C}^n$ admit
an inner product, and $M_{n,n}(\mathbf{R})$ admits matrix multiplication and transpose.

We use the following notations unless otherwise stated. 

Notation 

$\mathbf{0}$ represents the additive zero in $V$;

$-v$ represents the additive inverse of $v$ in $V$; and
$0$ represents the zero element in $F$.

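Proposition 8.1.1 Let $V$ be a vector space over $F$. Then

(i) $\alpha\mathbf{0} = \mathbf{0}$ for all $\alpha$ in $F$;

(ii) $0v = \mathbf{0}$ for all $v$ in $V$;

(iii) $(-\alpha)v = \alpha(-v) = -(\alpha v)$ for all $\alpha$ in $F$ and $v$ in $V$;

(iv) $\alpha v = \mathbf{0}$ implies that either $\alpha = 0$ or $v = \mathbf{0}$.

Proof (i) and (ii) follow by using the cancellation law of addition in $V$; for example,
$\alpha\mathbf{0} = \alpha(\mathbf{0} + \mathbf{0}) = \alpha\mathbf{0} + \alpha\mathbf{0}$ gives $\alpha\mathbf{0} = \mathbf{0}$.

(iii) follows by uniqueness of inverse in $V$, since $(-\alpha)v + \alpha v = 0v = \mathbf{0}$ and
$\alpha(-v) + \alpha v = \alpha\mathbf{0} = \mathbf{0}$.

(iv) If $\alpha \neq 0$, then $\alpha^{-1}$ exists in $F$ and hence $v = 1v = \alpha^{-1}(\alpha v) = \alpha^{-1}\mathbf{0} = \mathbf{0}$. □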

Remark Proposition 8.1.1 says that the multiplication by zero element of V or of F 
always gives the zero element of V . So we can use the same symbol 0 for both of 
them. 


8.2 Subspaces 

In the Euclidean 3-space $\mathbf{R}^3$, the vectors which lie in a fixed plane through the
origin form by themselves a 2-dimensional vector space which is a proper subset of
the whole space.

We now consider similar subsets of a vector space $V$ which inherit the vector space
structure from $V$. Such subsets introduce the concept of subspaces and play an
important role in the study of vector spaces. We are interested in showing how to
construct isomorphic replicas of all homomorphic images of a specified abstract vector
space. Subspaces play an important role in determining both the structure of a vector
space $V$ and the nature of homomorphisms with domain $V$.

Definition 8.2.1 Let $V$ be a vector space over a field $F$. A non-empty subset $U$ of
$V$ is called a subspace of $V$ iff

(i) $(U, +)$ is a subgroup of $(V, +)$;

(ii) for $\alpha \in F$ and $u \in U$, $\alpha u \in U$.

Clearly, for any vector space V , {0} and V are subspaces of V , called trivial sub- 
spaces. Any other subspace of V (if it exists) is called a proper subspace of V. 

An equivalent definition of a subspace can be obtained from the following theo- 
rem. 

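Theorem 8.2.1 Let $V$ be a vector space over $F$. Then a non-empty subset $U$ of
$V$ is a subspace of $V$ iff for every pair of elements $u, v \in U$, $\alpha u + \beta v \in U$ for all
$\alpha, \beta \in F$.

Proof Let $U$ be a subspace of $V$. Then for all $\alpha, \beta \in F$ and $u, v \in U$, $\alpha u, \beta v \in U$
imply $\alpha u + \beta v \in U$. The converse follows by taking $\alpha = 1 = \beta$ and $\beta = 0$
successively in $\alpha u + \beta v$ to show that $u + v \in U$ and $\alpha u \in U$, respectively; in particular
$-u = (-1)u \in U$, so $(U, +)$ is a subgroup of $(V, +)$. □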

Example 8.2.1 (i) The set $V_1 = \{(x, 0) \in \mathbf{R}^2\}$, i.e., the $x$-axis, is a proper subspace
of $V = \mathbf{R}^2$ over $\mathbf{R}$. Similarly, $V_2 = \{(0, y) \in \mathbf{R}^2\}$, i.e., the $y$-axis, is a proper subspace
of $V$ over $\mathbf{R}$.

(ii) All straight lines in the Euclidean plane $\mathbf{R}^2$ passing through the origin $(0, 0)$
are proper subspaces of $\mathbf{R}^2$. Other subspaces of $\mathbf{R}^2$ are $\{0\}$ and $\mathbf{R}^2$ itself.

(iii) All straight lines and planes in the Euclidean space $\mathbf{R}^3$ passing through
the origin $(0, 0, 0)$ are proper subspaces of $\mathbf{R}^3$. Other subspaces of $\mathbf{R}^3$ are $\{0\}$ and
$\mathbf{R}^3$ itself. On the other hand, $\mathbf{R}^2$ is not a subspace of $\mathbf{R}^3$, because $\mathbf{R}^2$ is not a subset
of $\mathbf{R}^3$.

(iv) V n [x] is a subspace of F[x] (see Example 8.1.1(iv)). 

(v) D([a, b]) is a subspace of C([a , b]) (see Example 8.1.1(v)). 

Proposition 8.2.1 Let $V$ be a vector space over $F$ and $\{V_i\}_{i \in I}$ be a family of
subspaces of $V$. Then their intersection $M = \bigcap_{i \in I} V_i$ is also a subspace of $V$.

Proof Clearly, $0 \in M$ implies that $M \neq \emptyset$. Moreover, for $x, y \in M$ and $\alpha, \beta \in F$,
$\alpha x + \beta y$ is in each $V_i$ and hence $\alpha x + \beta y \in M$ shows that $M$ is a subspace of $V$. □

Remark The union of two subspaces of $V$ may not be a subspace of $V$. For example,
the $x$-axis $X$ and the $y$-axis $Y$ are subspaces of $\mathbf{R}^2$ over $\mathbf{R}$, but their union $X \cup Y$
is not a subspace of $\mathbf{R}^2$, because $u = (1, 0) \in X$ and $v = (0, 1) \in Y$ but $u + v =
(1, 1) \notin X \cup Y$.

We now want to define the smallest subspace of $\mathbf{R}^2$ containing $X \cup Y$. This leads
to the concept of sum of subspaces of a vector space.

Proposition 8.2.2 Let $V$ be a vector space over $F$ and $V_1$, $V_2$ be subspaces of $V$.
Then $U = V_1 + V_2 = \{v_1 + v_2 : v_1 \in V_1, v_2 \in V_2\}$ is the smallest subspace of $V$
containing both $V_1$ and $V_2$.

Proof Left as an exercise. □ 
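The remark on unions and the statement of Proposition 8.2.2 can be illustrated by brute force. The sketch below is not a proof; it assumes the finite field $\mathbf{Z}_3$ in place of $\mathbf{R}$ so that every set involved is finite, and the helper `is_subspace` (an illustrative name) applies the criterion of Theorem 8.2.1 literally.

```python
from itertools import product

P = 3  # assumed prime; Z_3 stands in for the field R of the text


def add(u, v):
    """Componentwise addition in (Z_P)^2."""
    return tuple((a + b) % P for a, b in zip(u, v))


def scale(alpha, v):
    """Scalar multiplication of v by alpha in Z_P."""
    return tuple((alpha * a) % P for a in v)


def is_subspace(subset):
    """Criterion of Theorem 8.2.1: non-empty and closed under au + bv."""
    U = set(subset)
    if not U:
        return False
    return all(add(scale(a, u), scale(b, v)) in U
               for a, b in product(range(P), repeat=2)
               for u, v in product(U, repeat=2))


X = {(a, 0) for a in range(P)}             # the x-axis
Y = {(0, b) for b in range(P)}             # the y-axis
union = X | Y                              # set-theoretic union of X and Y
total = {add(x, y) for x in X for y in Y}  # the sum X + Y
```

Here `is_subspace(union)` is false because $(1, 0) + (0, 1) = (1, 1)$ lies outside the union, while `total` is the whole plane and therefore a subspace.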

Definition 8.2.2 Let $V$ be a vector space over $F$ and $V_1$, $V_2$ be subspaces of $V$.
Then the subspace $U = V_1 + V_2$ is called the sum of the subspaces $V_1$ and $V_2$. In
general, if $V_1, V_2, \ldots, V_n$ are subspaces of $V$, then $\sum_{i=1}^{n} V_i = \{v_1 + v_2 + \cdots +
v_n : v_i \in V_i\}$ is a subspace of $V$, called the sum of the subspaces $V_1, V_2, \ldots, V_n$,
denoted by $V_1 + V_2 + \cdots + V_n$. In particular, if $V = V_1 + V_2$ and $V_1 \cap V_2 = \{0\}$,
then $V$ is said to be a direct sum of $V_1$ and $V_2$, denoted by $V = V_1 \oplus V_2$. The
subspace $V_1$ (or $V_2$) is said to be a complement of $V_2$ (or $V_1$) in $V$. Similarly, $V =
V_1 \oplus V_2 \oplus \cdots \oplus V_n$ iff $V = V_1 + V_2 + \cdots + V_n$ and $(V_1 + V_2 + \cdots + V_i) \cap V_{i+1} = \{0\}$,
for $i = 1, 2, \ldots, n - 1$.

Remark A complement of a subspace of $V$ (if it exists) may not be unique. For its
existence in a finite dimensional vector space see Corollary 2 to Theorem 8.4.6.

Example 8.2.2 (i) If $V_1$ is a line through the origin in $\mathbf{R}^2$, then any line through the
origin, other than $V_1$, in $\mathbf{R}^2$ is a complement of $V_1$ in $\mathbf{R}^2$.

(ii) Let $V_1 = \{(x, 0) \in \mathbf{R}^2\}$, $V_2 = \{(0, y) \in \mathbf{R}^2\}$ and $V_3 = \{(x, x) \in \mathbf{R}^2\}$. Then
$\mathbf{R}^2 = V_1 \oplus V_2 = V_2 \oplus V_3$ but $V_1 \neq V_3$.


An equivalent definition of direct sum of subspaces of a vector space is obtained 
from the following theorem. 

Theorem 8.2.2 A vector space $V$ is a direct sum of its subspaces $V_1$ and $V_2$ iff every
vector $v$ in $V$ can be expressed uniquely as $v = v_1 + v_2$, where $v_1 \in V_1$ and $v_2 \in V_2$.
In general, $V = V_1 \oplus V_2 \oplus \cdots \oplus V_n$ iff every element $v$ of $V$ can be expressed
uniquely as $v = v_1 + v_2 + \cdots + v_n$, $v_i \in V_i$, $i = 1, 2, \ldots, n$.


Proof Let $V = V_1 \oplus V_2$. Then $V = V_1 + V_2$ and $V_1 \cap V_2 = \{0\}$. Let $v = v_1 + v_2 =
v_3 + v_4$, where $v_1, v_3 \in V_1$ and $v_2, v_4 \in V_2$. Then $v_1 - v_3 = v_4 - v_2 \in V_1 \cap V_2 =
\{0\}$ $\Rightarrow$ $v_1 = v_3$ and $v_2 = v_4$. Conversely, let every $v \in V$ be expressed uniquely as $v =
v_1 + v_2$, where $v_i \in V_i$, $i = 1, 2$. Then $V = V_1 + V_2$. If $w \in V_1 \cap V_2$, then $w =
w + 0 = 0 + w$ shows that $w = 0$ by the unique expression of $w$. The proof for the
general case is left as an exercise. □

Example 8.2.3 (i) $\mathbf{R}^4 = V_1 \oplus V_2$, where $V_1 = \{(x, y, 0, 0) \in \mathbf{R}^4\}$ and $V_2 =
\{(0, 0, z, t) \in \mathbf{R}^4\}$.

(ii) Let $V_n[x]$ be the vector space over $\mathbf{R}$ defined in Example 8.1.1(iv) for
$F = \mathbf{R}$ and $U_i = \{r x^i : r \in \mathbf{R}\}$. Then each $U_i$ is a subspace of $V_n[x]$, for $i =
0, 1, 2, \ldots, n - 1$. Moreover, each $f(x) \in V_n[x]$ can be expressed uniquely as

$f(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}$, $a_i \in \mathbf{R}$.

Consequently, $V_n[x] = U_0 \oplus U_1 \oplus \cdots \oplus U_{n-1}$.
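As a worked illustration of Theorem 8.2.2, the two decompositions behind Example 8.2.2(ii) can be written out explicitly. The symbols $a$, $b$, $r$, $s$ are introduced only for this sketch and are not taken from the text.

```latex
% Decomposition of an arbitrary (a, b) in R^2 with respect to
% V_1 = {(x, 0)}, V_2 = {(0, y)} and V_3 = {(x, x)}.
\begin{align*}
(a, b) &= (a, 0) + (0, b),     & (a, 0) &\in V_1,\ (0, b) \in V_2, \\
(a, b) &= (0, b - a) + (a, a), & (0, b - a) &\in V_2,\ (a, a) \in V_3.
\end{align*}
% Uniqueness in the second case: if (a, b) = (0, s) + (r, r), then
% comparing coordinates gives r = a and s = b - a.
```

The same vector thus has different components in $V_2$ depending on which complement is chosen, which is why a complement is not unique.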


8.3 Quotient Spaces 

We continue the study of subspaces of vector spaces. 

We are interested in showing how to construct isomorphic replicas of all homomorphic
images of a specified abstract vector space. For this purpose we introduce the
concept of quotient spaces, which plays an important role in determining both the
structure of a vector space $V$ and the nature of homomorphisms with domain $V$. A
subspace of a vector space plays an essential role in constructing a quotient space, which is
associated with the mother vector space in a natural way. Like quotient groups and
quotient rings, the concept of quotient spaces is introduced in the following way: Let
$V$ be a vector space over $F$ and $U$ a subspace of $V$. Then $(U, +)$ is a subgroup of
the abelian group $(V, +)$ and hence $(V/U, +)$ is also an abelian group. The scalar
multiplication

$\mu : F \times V/U \to V/U$, $(\alpha, v + U) \mapsto \alpha v + U$

is well defined, because $v + U = v' + U$ implies $v - v' \in U$ and hence $\alpha v - \alpha v' \in U$,
and it makes $V/U$ a vector space over $F$.

Definition 8.3.1 The vector space $V/U$ is called the quotient space of $V$ by $U$ and
the map $p : V \to V/U$, $v \mapsto v + U$ is called a canonical homomorphism.
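The construction of $V/U$ can be made concrete by listing cosets. The following is a minimal sketch, assuming $V = (\mathbf{Z}_3)^2$ and $U$ the $x$-axis; the finite field and all function names are assumptions made for illustration. It also checks that scalar multiplication on cosets does not depend on the chosen representative.

```python
from itertools import product

P = 3  # assumed prime; the field is Z_3
V = list(product(range(P), repeat=2))    # all vectors of (Z_3)^2
U = frozenset((x, 0) for x in range(P))  # the subspace U: the x-axis


def add(u, v):
    """Componentwise addition in (Z_P)^2."""
    return tuple((a + b) % P for a, b in zip(u, v))


def scale(alpha, v):
    """Scalar multiplication of v by alpha in Z_P."""
    return tuple((alpha * a) % P for a in v)


def coset(v):
    """The coset v + U as a set of vectors."""
    return frozenset(add(v, u) for u in U)


cosets = {coset(v) for v in V}  # the elements of V/U


def scalar_multiplication_well_defined():
    """alpha(v + U) = alpha v + U must not depend on the representative v."""
    return all(coset(scale(a, v)) == coset(scale(a, w))
               for a in range(P) for v in V for w in V
               if coset(v) == coset(w))
```

In this example there are exactly three cosets, one for each value of the second coordinate, mirroring the horizontal lines $y = t$ of the real case.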


8.3.1 Geometrical Interpretation of Quotient Spaces 

We now present geometrical interpretation of some quotient spaces through Exam- 
ple 8.3.1. 

Example 8.3.1 (i) Let $V = \mathbf{R}^2$ be the Euclidean plane and $U$ be the $x$-axis. Then
$V/U$ is the set of all lines in $\mathbf{R}^2$ which are parallel to the $x$-axis. This is so because for
$v = (r, t) \in \mathbf{R}^2$, $v + U = (r, t) + U = \{(r + x, t) : x \in \mathbf{R}\}$ is the line $y = t$, which is
parallel to the $x$-axis. This line is above or below the $x$-axis according to $t > 0$ or $t < 0$.
If $t = 0$, this line coincides with the $x$-axis.

(ii) If $V = \mathbf{R}^3$ is the Euclidean 3-space and $U = \{(x, y, 0) \in \mathbf{R}^3\}$, then $U$ is the
$xy$-plane and for any $v = (r, t, s) \in \mathbf{R}^3$, the coset $v + U$ represents geometrically
the plane parallel to the $xy$-plane through the point $v = (r, t, s)$ at a distance $|s|$ from the
$xy$-plane (above or below the $xy$-plane according to $s > 0$ or $s < 0$).

(iii) Let $V = \mathbf{R}^3$ and $U = \{(x, y, z) \in \mathbf{R}^3 : 5x - 4y + 3z = 0$ and $2x - 3y +
4z = 0\}$. Then for any $v = (a, b, c) \in \mathbf{R}^3$, the coset $v + U \in V/U$ represents
geometrically the line through $v$ parallel to $U$, namely the line determined by the
intersection of the two planes:

$5(x - a) - 4(y - b) + 3(z - c) = 0$ and $2(x - a) - 3(y - b) + 4(z - c) = 0$.

(iv) Let $V = \mathbf{R}^2$ be the Euclidean plane and $W = \{(x, y) \in \mathbf{R}^2 : ax + by = 0\}$ for a
fixed non-zero $(a, b)$ in $\mathbf{R}^2$. The cosets of $W$ in $\mathbf{R}^2$ are given by the lines $ax + by = c$
for $c \in \mathbf{R}$. This is so because if $v \in \mathbf{R}^2$, then the coset $v + W$ is the line through $v$ parallel to
the line $ax + by = 0$ and hence the cosets of $W$ in $\mathbf{R}^2$ are given by the system of
parallel lines $ax + by = c$, $c \in \mathbf{R}$.


8.4 Linear Independence and Bases 

We recall that $\mathbf{R}^n$ is a vector space over $\mathbf{R}$ and the $n$ vectors $e_1 = (1, 0, \ldots, 0)$, $e_2 =
(0, 1, \ldots, 0)$, $\ldots$, $e_n = (0, 0, \ldots, 1)$ of $\mathbf{R}^n$ determine every vector of $\mathbf{R}^n$ uniquely.
Again if we consider the vector spaces $V = \mathbf{R}[x]$ and $V_n[x]$ (taking $F = \mathbf{R}$ in
Example 8.2.1(iv)), we find a finite number of elements, such as $1, x, x^2, \ldots, x^{n-1}$,
which determine every element $f(x)$ of $V_n[x]$ uniquely as $f(x) = \alpha_0 + \alpha_1 x + \cdots +
\alpha_{n-1} x^{n-1}$. On the other hand, there is no finite set of elements in $\mathbf{R}[x]$ which
determines every element $f(x)$ of $\mathbf{R}[x]$ uniquely. Such situations lead to the concept of
linear independence and bases in vector spaces. For this purpose we first introduce
the concepts of linear combinations and generators in a vector space.

Definition 8.4.1 Let $V$ be a vector space over a field $F$ and $S$ be a non-empty
subset (finite or infinite) of $V$. An element $v \in V$ is said to be a linear combination
of elements of $S$ over $F$ iff there exist a finite number of elements $v_1, v_2, \ldots, v_n$ in
$S$ and $\alpha_1, \alpha_2, \ldots, \alpha_n$ in $F$ such that

$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = \sum_{i=1}^{n} \alpha_i v_i$.

Example 8.4.1 R 2 is a vector space over R. If S = {(3, 4), (1, 2)}, then the ele- 
ment (9, 14) is a linear combination of elements of S. This is so because (9, 14) = 
2(3, 4) + 3(1, 2). 
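The coefficients in Example 8.4.1 can be recovered by solving a $2 \times 2$ linear system. The sketch below uses Cramer's rule with exact rational arithmetic; the function name `coefficients_2d` is hypothetical and the method is one possible choice, not the one used in the text.

```python
from fractions import Fraction


def coefficients_2d(s1, s2, target):
    """Solve a*s1 + b*s2 = target in R^2 by Cramer's rule.

    Returns (a, b) as exact fractions, or None when s1 and s2 are
    parallel, in which case the coefficients are not uniquely determined.
    """
    det = s1[0] * s2[1] - s1[1] * s2[0]
    if det == 0:
        return None
    a = Fraction(target[0] * s2[1] - target[1] * s2[0], det)
    b = Fraction(s1[0] * target[1] - s1[1] * target[0], det)
    return a, b


# The data of Example 8.4.1: S = {(3, 4), (1, 2)} and the vector (9, 14).
a, b = coefficients_2d((3, 4), (1, 2), (9, 14))
```

For this input the function returns the pair $(2, 3)$, in agreement with $(9, 14) = 2(3, 4) + 3(1, 2)$.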

Proposition 8.4.1 Let $V$ be a vector space over $F$ and $S$ be a non-empty subset
of $V$. Then the set $L(S)$ of all linear combinations of $S$ is a subspace of $V$.

Proof As $S \subseteq L(S)$, $L(S) \neq \emptyset$. Let $u, v \in L(S)$. Suppose $u = \sum_{i=1}^{n} \alpha_i u_i$ and $v =
\sum_{j=1}^{m} \beta_j v_j$, $\alpha_i, \beta_j \in F$ and $u_i, v_j \in S$. Then for $\alpha, \beta \in F$, $\alpha u + \beta v =
\sum_{i=1}^{n} (\alpha\alpha_i) u_i + \sum_{j=1}^{m} (\beta\beta_j) v_j \in L(S)$ implies
that $L(S)$ is a subspace of $V$. □

Definition 8.4.2 $L(S)$ is called the linear space spanned by $S$ in $V$.

$L(S)$ is the smallest subspace of $V$ containing all the vectors of the given set $S$.

We now proceed to introduce the concept of “generators” in a vector space $V$.
Let $S$ (may be empty) be a subset of $V$. Clearly, $S$ is contained in at least one
subspace of $V$, namely, $V$. Then the intersection of all subspaces of $V$ containing $S$ is
a subspace of $V$ by Proposition 8.2.1. It is the smallest subspace of $V$ containing $S$.
This subspace is denoted by $\langle S \rangle$.

Definition 8.4.3 $\langle S \rangle$ is called the subspace generated by $S$. In particular, if
$\langle S \rangle = V$, then $V$ is said to be generated by $S$ and $S$ is called a set of generators
for $V$.

The elements of $\langle S \rangle$ can be obtained from the following theorem.

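Theorem 8.4.1 Let $V$ be a vector space over $F$ and $S$ be an arbitrary subset of $V$.
Then

$\langle S \rangle = \{0\}$, if $S = \emptyset$,
$\langle S \rangle = L(S)$, if $S \neq \emptyset$.

Proof If $S = \emptyset$, the smallest subspace of $V$ containing $S$ is $\{0\}$. Hence $\langle S \rangle = \{0\}$,
if $S = \emptyset$. If $S \neq \emptyset$, we claim that $L(S)$ is the smallest subspace of $V$ containing $S$.
Let $W$ be any subspace of $V$ such that $S \subseteq W$. Let $v = \sum_{i=1}^{n} \alpha_i v_i \in L(S)$, $\alpha_i \in F$,
$v_i \in S$, $i = 1, 2, \ldots, n$. Then each $v_i \in W$ $\Rightarrow$ $v \in W$ $\Rightarrow$ $L(S) \subseteq W$ $\Rightarrow$ $L(S)$ is the
smallest subspace of $V$ containing $S$ $\Rightarrow$ $L(S) = \langle S \rangle$. □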

Definition 8.4.4 A vector space $V$ is said to be finitely generated iff there exists
a finite subset $S$ of $V$ such that $V = \langle S \rangle$ and $S$ is called a set of generators of $V$.
Otherwise, $V$ is said to be not finitely generated; sometimes it is called infinitely
generated.

Example 8.4.2 (i) $\mathbf{R}^2, \mathbf{R}^3, \ldots$ are finitely generated vector spaces over $\mathbf{R}$.

This is so because if $S = \{(1, 0), (0, 1)\}$, then $\langle S \rangle = \mathbf{R}^2$; if $S = \{(1, 0, 0), (0, 1, 0),
(0, 0, 1)\}$, then $\langle S \rangle = \mathbf{R}^3$; if $S = \{e_1 = (1, 0, \ldots, 0), e_2 = (0, 1, \ldots, 0), \ldots, e_n =
(0, 0, \ldots, 1)\}$ $(\subset \mathbf{R}^n)$, then $\langle S \rangle = \mathbf{R}^n$.

(ii) A set of generators of $V$ may not be unique. For example, $S = \{(1, 0), (0, 1)\}$
and $T = \{(1, 0), (0, 1), (1, 1)\}$ are both sets of generators of $\mathbf{R}^2$.

(iii) If $S = \{1, i\}$, then $\langle S \rangle = \mathbf{C}$ implies that $\mathbf{C}$ is a finitely generated vector space
over $\mathbf{R}$.

(iv) $\mathbf{R}[x]$ is not finitely generated. This is so because any linear combination of a
finite number of polynomials $f_1(x), f_2(x), \ldots, f_n(x)$ is a polynomial whose degree
does not exceed the maximum degree $m$ (say) of the above polynomials. But $\mathbf{R}[x]$
contains polynomials of degree greater than $m$.

Remark Let $V$ be a vector space over $F$ and $S$ be a non-empty subset of $V$.

(a) If $S = \{v\}$, then $L(S) = \{\alpha v : \alpha \in F\} = \langle v \rangle$.

(b) If $S = \{u, v\}$, then $L(S) = \{\alpha u + \beta v : \alpha, \beta \in F\} = \langle u, v \rangle$.

(c) For any non-empty subsets $A$, $B$ of a vector space $V$,

(i) $A$ is a subspace of $V$ iff $A = \langle A \rangle$;

(ii) $A \subseteq B$ implies $L(A) \subseteq L(B)$;

(iii) $\langle L(A) \rangle = L(A)$;

(iv) $L(A \cup B) = L(A) + L(B)$.


8.4.1 Geometrical Interpretation 

Consider the Euclidean space $\mathbf{R}^3$. If $v \neq 0$ is a vector in $\mathbf{R}^3$, then $L(v) = \{\alpha v :
\alpha \in \mathbf{R}\}$ represents geometrically the line in $\mathbf{R}^3$ through the origin and the point $v$.
Again if $u$, $v$ are two non-zero vectors in $\mathbf{R}^3$ such that $v \neq ru$ for any non-zero real
number $r$ (or equivalently, $u \neq tv$ for any non-zero real number $t$), then for $S =
\{u, v\}$, $L(S) = \{\alpha u + \beta v : \alpha, \beta \in \mathbf{R}\}$ represents geometrically the plane through the
origin and the points $u$, $v$ in $\mathbf{R}^3$. Such examples motivate the introduction of the concepts
of linear independence, basis and dimension of a vector space.

Definition 8.4.5 Let $V$ be a vector space over $F$.

1. The empty set $\emptyset$ is considered as linearly independent over $F$.

2. A non-empty finite subset $S = \{v_1, v_2, \ldots, v_n\}$ of $V$ is said to be linearly
independent over $F$ iff for $\alpha_1, \alpha_2, \ldots, \alpha_n \in F$, $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = 0$ implies
$\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$.

3. An infinite subset $S$ of $V$ is said to be linearly independent over $F$ iff every finite
subset of $S$ is linearly independent.

Definition 8.4.6 A subset $S$ of a vector space over $F$ is said to be linearly dependent
over $F$ iff it is not linearly independent over $F$.

Remark If a subset $S = \{v_1, v_2, \ldots, v_n\}$ is a linearly dependent set in a vector space
$V$ over $F$, this implies that there exist $\alpha_1, \alpha_2, \ldots, \alpha_n$ in $F$, not all 0, such that $\alpha_1 v_1 +
\alpha_2 v_2 + \cdots + \alpha_n v_n = 0$. For example, $S = \{(1, 0), (0, 1)\}$ is a linearly independent
subset of $\mathbf{R}^2$ over $\mathbf{R}$, while $T = \{(1, 0), (0, 1), (1, 1)\}$ is a linearly dependent subset
of $\mathbf{R}^2$ over $\mathbf{R}$. A linearly independent set cannot contain the vector 0.

Remark Linear independence (or dependence) in a vector space is not a property of 
an individual vector but a property of a set of vectors. 
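Linear independence of a finite list of vectors with rational entries can be tested by row reduction: the list is independent exactly when its rank equals its length. The sketch below is one way to do this with exact arithmetic; the function names `rank` and `is_linearly_independent` are illustrative and not part of the text.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """A finite list is independent iff its rank equals its length."""
    return rank(vectors) == len(vectors)


S = [(1, 0), (0, 1)]
T = [(1, 0), (0, 1), (1, 1)]
```

With these definitions `S` is reported as independent and `T` as dependent, and the empty list counts as independent, matching the convention of Definition 8.4.5.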

Example 8.4.3 Consider the vector space $\mathbf{R}^2$ over $\mathbf{R}$.

1. If $v \neq 0$, $\{v\}$ is linearly independent, because $\alpha v = 0 \Rightarrow \alpha = 0$.

2. If $S = \{(1, 0), (2, 0)\}$, then $1(2, 0) - 2(1, 0) = (0, 0)$ implies that $S$ is linearly
dependent.

3. A subset $S$ containing the zero vector 0 is linearly dependent.

Proposition 8.4.2 If the subset $S = \{v_1, v_2, \ldots, v_n\}$ of a vector space $V$ is linearly
independent over $F$, then every element $v$ in $L(S)$ has a unique expression of the
form

$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$, $\alpha_i \in F$.

Proof Every element $v \in L(S)$ is of the above form by definition. To show its unique
expression, let

$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$
$= \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n$.

Then

$(\alpha_1 - \beta_1) v_1 + (\alpha_2 - \beta_2) v_2 + \cdots + (\alpha_n - \beta_n) v_n = 0$
$\Rightarrow \alpha_i = \beta_i$, $i = 1, 2, \ldots, n$. □

Theorem 8.4.2 Let $S = \{v_1, v_2, \ldots, v_n\}$ be a finite non-empty ordered set of
vectors in a vector space $V$ over $F$. If $n = 1$, then $S$ is linearly independent iff $v_1 \neq 0$. If
$n > 1$ and $v_1 \neq 0$, then either $S$ is linearly independent or some vector $v_m$ $(m > 1)$
is a linear combination of its preceding ones, $v_1, v_2, \ldots, v_{m-1}$.

Proof If $S$ is linearly independent, there is nothing to prove. So we assume that $S$ is
linearly dependent. Then there exist $\alpha_1, \alpha_2, \ldots, \alpha_n$ (not all 0) in $F$ such that

$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = 0$.

Let $m$ be the largest integer such that $\alpha_m \neq 0$. Then $\alpha_i = 0$ for all $i > m$, and $m > 1$,
since $m = 1$ would give $\alpha_1 v_1 = 0$ with $\alpha_1 \neq 0$, i.e., $v_1 = 0$. Hence
$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_{m-1} v_{m-1} + \alpha_m v_m = 0$

implies that

$v_m = \alpha_m^{-1}(-\alpha_1 v_1 - \alpha_2 v_2 - \cdots - \alpha_{m-1} v_{m-1})$
$= (-\alpha_m^{-1}\alpha_1) v_1 + (-\alpha_m^{-1}\alpha_2) v_2 + \cdots + (-\alpha_m^{-1}\alpha_{m-1}) v_{m-1}$.

This proves the theorem. □

This theorem gives a series of interesting corollaries. 

Corollary 1 Let $V$ be a vector space over $F$ and $S = \{v_1, v_2, \ldots, v_n\}$ a linearly
independent subset of $V$. If $v \in V$, then $S \cup \{v\}$ is linearly independent iff $v$ is not
an element of $L(S)$.

Proof If $S \cup \{v\}$ is linearly independent, then $v \neq 0$. Suppose $v \in L(S)$. Then
there exist $\alpha_i \in F$ such that $v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$. Since $v \neq 0$, $\alpha_1, \ldots, \alpha_n$ are not
all 0. Hence $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n + (-1)v = 0$ $\Rightarrow$ $S \cup \{v\}$ is linearly dependent
over $F$, which is a contradiction. Hence $v \notin L(S)$.

Conversely, suppose $v \notin L(S)$. If possible, suppose $S \cup \{v\}$ is linearly dependent.
Then by Theorem 8.4.2, applied to the ordered set $v_1, \ldots, v_n, v$, some vector is a linear
combination of its preceding ones; as $S$ is linearly independent, this vector must be $v$, so
$v \in L(S)$, which is a contradiction. Hence $S \cup \{v\}$ is linearly
independent. □

Corollary 2 Let the subset $S = \{v_1, v_2, \ldots, v_n\}$ of the vector space $V$ be such
that $U = L(S)$. If $\{v_1, v_2, \ldots, v_m\}$ $(m \le n)$ is a linearly independent set, then there
exists a linearly independent subset $T$ of $S$ of the form $T = \{v_1, v_2, \ldots, v_m, v_{t_1},
v_{t_2}, \ldots, v_{t_r}\}$ such that $U = L(T)$.

Proof If $S$ is linearly independent, then we take $T = S$. If not, we take out from
$S$ the first vector $v_k$ which is a linear combination of its preceding ones; such a vector
exists by Theorem 8.4.2, and $k > m$ because $\{v_1, v_2, \ldots, v_m\}$ is linearly independent. Then
$v_1, v_2, \ldots, v_t$ are linearly independent for $t < k$. Clearly, $B = \{v_1, v_2, \ldots, v_{k-1},
v_{k+1}, \ldots, v_n\}$ has $n - 1$ elements and $L(B) \subseteq U$. To show the equality, let $u \in U$.
Then it can be expressed as a linear combination of elements of $S$. But in this linear
combination, we can replace $v_k$ by a linear combination of $v_1, v_2, \ldots, v_{k-1}$. Thus
$u$ is a linear combination of $v_1, v_2, \ldots, v_{k-1}, v_{k+1}, \ldots, v_n$.

Continuing this taking out process we obtain a subset $T = \{v_1, \ldots, v_m, v_{t_1},
v_{t_2}, \ldots, v_{t_r}\}$ (say) of $S$ such that $L(T) = U$ and in which no element is a linear
combination of the preceding ones. $T$ is clearly linearly independent by
Theorem 8.4.2. □
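The taking out process in the proof of Corollary 2 is effectively an algorithm. The sketch below implements it for vectors with rational entries, discarding each vector that is a linear combination of the ones kept before it; the names `rank` and `sift` are illustrative, and row reduction is used here only as a convenient test for membership in a span.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def sift(vectors):
    """Keep each vector that is not a linear combination of those kept so far.

    The kept vectors span the same subspace as the input list and are
    linearly independent, so they form a basis of that subspace.
    """
    kept = []
    for v in vectors:
        if rank(kept + [v]) > len(kept):  # v lies outside L(kept)
            kept.append(v)
    return kept
```

For instance, `sift([(1, 0), (0, 1), (1, 1)])` keeps the first two vectors and discards $(1, 1)$, which is their sum.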

Corollary 3 Let $V$ be a finitely generated vector space. Then $V$ contains a linearly
independent finite subset $T$ such that $L(T) = V$.

Proof Since $V$ is finitely generated, there exists a subset $S = \{v_1, v_2, \ldots, v_n\}$ of $V$
such that $V = L(S)$. By using Corollary 2, we can find a linearly independent subset
$T$ of $S$ such that $L(T) = V$. □

Definition 8.4.7 Let $V$ be a non-zero vector space over $F$. A non-empty subset $B$
of $V$ is said to be a basis of $V$ over $F$ iff

(i) $B$ is linearly independent over $F$; and

(ii) $V = L(B)$.

If $V = \{0\}$, the empty set $\emptyset$ is considered to be a basis of $V$.

Proposition 8.4.3 If a vector space $V$ is finitely generated and $S = \{v_1, v_2, \ldots, v_n\}$
is a subset of $V$ such that $V = L(S)$, then a subset $B$ of $S$ forms a basis of $V$.

Proof It follows from Corollary 3. □

The following theorem proves the existence of a basis of an arbitrary vector 
space. 


Theorem 8.4.3 (Existence Theorem) Every vector space $V$ over $F$ has a basis.

Proof If $V = \{0\}$, then $\emptyset$ is considered to be a basis of $V$. Next suppose $V \neq \{0\}$.
Let $\mathcal{S}$ be the family of all linearly independent subsets of $V$, i.e., $\mathcal{S} = \{X \subseteq V :
X$ is linearly independent over $F\}$. Since every non-zero element of $V$ forms a
linearly independent subset, it is in $\mathcal{S}$ and hence $\mathcal{S}$ is non-empty. We order $\mathcal{S}$ partially
under set inclusion. The union of any chain in $\mathcal{S}$ is again linearly independent, because
each of its finite subsets lies in a single member of the chain; so every chain has an upper
bound in $\mathcal{S}$. By Zorn’s Lemma, $\mathcal{S}$ has a maximal element, say $B$. Then $B$
is linearly independent. We claim that $L(B) = V$. Otherwise, there exists an element $v \in V$
such that $v \notin L(B)$. Then $B \cup \{v\}$ is linearly independent by Corollary 1 to
Theorem 8.4.2. This contradicts the maximality of $B$. Hence $B$ is a basis of $V$. □

We now present an alternative criterion for a basis. 

Theorem 8.4.4 A non-empty subset $B$ of a vector space $V$ is a basis of $V$ iff every
element of $V$ can be expressed uniquely as a linear combination of elements of $B$.

Proof First suppose that $B$ forms a basis of $V$. Then $B$ is linearly independent and
$V = L(B)$. Hence every element $v \in V$ has a unique expression in the desired form
by Proposition 8.4.2.

Conversely, let every element of $V$ be expressed uniquely as a linear combination
of elements of $B$. Then $V \subseteq L(B) \subseteq V$ $\Rightarrow$ $V = L(B)$. From the unique expression
of 0, it follows that $B$ is linearly independent. Hence $B$ is a basis of $V$. □

Theorem 8.4.5 Let $V$ be a non-zero vector space over $F$.

(a) Let $S$ be an arbitrary linearly independent subset of $V$. If $S$ is not a basis of $V$,
then $S$ can be extended to a basis of $V$. In particular, every singleton set of a
non-zero vector of $V$ can be extended to a basis of $V$.

(b) If $S$ is a subset of $V$ such that $V = L(S)$, then $S$ contains a basis of $V$.

(c) If $V$ has a finite basis $B$ consisting of $n$ elements, then any other basis of $V$ is
also finite consisting of $n$ elements.

(d) Cardinality of every basis of $V$ is the same.

Proof Let $V$ be a non-zero vector space over $F$.

(a) Let $A$ denote the given linearly independent subset of $V$ (called $S$ in the statement)
and let $\mathcal{T}$ be the family of all linearly independent subsets of $V$ containing $A$, i.e.,
$\mathcal{T} = \{S \subseteq V : S$ is linearly independent and $A \subseteq S\}$. Then $\mathcal{T}$ is non-empty,
because $A \in \mathcal{T}$. We order
$\mathcal{T}$ partially by set inclusion. By Zorn’s Lemma, $\mathcal{T}$ has a maximal element $B$ (say).
$B$ is clearly linearly independent. Moreover, $L(B) = V$. Otherwise, for $v \notin L(B)$ the set
$B \cup \{v\}$ would be linearly independent by Corollary 1 to Theorem 8.4.2, and we contradict
the maximality of $B$. Hence $B$ is a basis of $V$ containing $A$. The last part follows immediately as
every singleton set of a non-zero vector is linearly independent.

(b) Let $S$ be any subset of $V$ such that $V = L(S)$ and $\mathcal{P}_S = \{B \subseteq S :
B$ is linearly independent$\}$. Proceeding as in (a), the maximal element of $\mathcal{P}_S$ forms
a basis of $V$.

(c) Let $B = \{v_1, v_2, \ldots, v_n\} = \{v_i\}$ be a finite basis of $V$ with $n$ elements and
$C = \{u_j : j \in J\}$ be any other basis of $V$, where $J$ is an indexing set. We claim
that $C$ is finite and if $C$ has $m$ elements, then $m = n$. If $C$ is not finite, we reach
a contradiction, because each $v_i$ is a linear combination of certain $u_j$’s and all the
$u_j$’s occurring in this way form a finite subset $S$ of $C$. If we assume that $C$ is not
finite, then there exists an element $u_{j_0}$ (say) in $C$ which is not in $S$. But $u_{j_0}$ is
a linear combination of the $v_i$’s and hence $u_{j_0}$ is also a linear combination of the
vectors in $S$. This implies by Corollary 1 to Theorem 8.4.2 that $S \cup \{u_{j_0}\}$ is a linearly
dependent subset of $C$. But this is not possible, since $C$ is a basis of $V$. Thus one
concludes that $C$ is finite.

Suppose $C = \{u_1, u_2, \ldots, u_m\}$ for some positive integer $m$. Then we prove that
$m = n$. Since $B$ is a basis of $V$, $u_1$ can be expressed as a linear combination of
$v_1, v_2, \ldots, v_n$ and hence the set $S_1 = \{u_1, v_1, v_2, \ldots, v_n\}$ is linearly dependent. Then
by Theorem 8.4.2, there is some $v_t \neq u_1$ in $B$ such that $v_t$ is a linear combination of
the vectors in $S_1$ preceding it. If we delete $v_t$ from $S_1$ then the remaining set $S_2 =
\{u_1, v_1, \ldots, v_{t-1}, v_{t+1}, \ldots, v_n\}$ is such that $L(S_2) = V$. Again $u_2 \in C$ is a linear
combination of vectors in $S_2$ and the set $S_3 = \{u_1, u_2, v_1, \ldots, v_{t-1}, \hat{v}_t, v_{t+1}, \ldots, v_n\}$
is linearly dependent and such that $L(S_3) = V$, where $\hat{v}_t$ denotes $v_t$ deleted. In
this process, we include one vector $u_j$ from $C$ and delete one vector $v_i$ from $B$.
Continuing this process, we cannot delete all the $v_i$’s before the $u_j$’s are exhausted.
This is so because in that case, the remaining $u_j$’s would be linear combinations of
vectors of $C$ already used. This contradicts the linear independence of the $u_j$’s. This
implies that $n$ cannot be less than $m$. Similarly, $m$ cannot be less than $n$. So we
conclude that $m = n$.

(d) Let $B = \{v_i : i \in I\}$ and $C = \{u_j : j \in J\}$ be two bases of $V$ (finite or infinite).
If any one of $B$ or $C$ is finite, then the other is also finite and they have the same
number of vectors by (c). We now consider the possibility when both $B$ and $C$
are infinite. Since $C$ is a basis, each $v_i \in B$ can be expressed uniquely as a linear
combination with non-zero coefficients of some $u_j$’s (finite in number), say, $v_i =
\alpha_1 u_{j_1} + \cdots + \alpha_n u_{j_n}$. Moreover, every $u_j$ appears in at least one such expression,
because, if some $u_{j_0}$ does not appear in any such expression, then the basis $B$ shows
that $u_{j_0}$ is a linear combination of some $v_i$’s and hence of some $u_j$’s each of which
is different from $u_{j_0}$. This is not possible, as the $u_j$’s are linearly independent.
Consequently, corresponding to each $v_i \in B$, there exists a finite non-empty set $S_{v_i}$
of $u_j$’s such that $C = \bigcup_{v_i \in B} S_{v_i}$. By the property of cardinality (see Chap. 1), it
follows that card $C \le$ card $B$. Again interchanging the roles of vectors of $B$ and $C$,
it follows that card $B \le$ card $C$. Hence card $B$ = card $C$. □

The above theorem leads to the definition of the dimension of a vector space. 

Definition 8.4.8 Let $V$ be a non-zero vector space over $F$. The cardinality of every
basis of $V$ is the same and this common value is called the dimension of $V$, denoted
by $\dim_F V$ or simply $\dim V$. The vector space $V$ is said to be finite dimensional or
infinite dimensional according to $V$ having a finite basis or not. If $V$ has a basis of
$n$ elements, then $\dim V = n$. On the other hand, if $V = \{0\}$, then $V$ is said to be
0-dimensional or said to have dimension 0.


Theorem 8.4.5 gives the following series of corollaries.

Corollary 1 Let $V$ be a vector space over $F$. If $B = \{v_1, v_2, \ldots, v_n\}$ is a maximal
set of linearly independent vectors of $V$, then $B$ forms a basis of $V$.

Corollary 2 Let $V$ be a finite dimensional vector space over $F$. If $\dim V = n$ and
$B = \{v_1, v_2, \ldots, v_n\}$ is linearly independent, then $B$ forms a basis of $V$.

Corollary 3 Let $V$ be a finite dimensional vector space over $F$ and $U$ be a
subspace of $V$ such that $\dim V = \dim U$. Then $U = V$.

Corollary 4 Let $V$ be a vector space of dimension $n$. If $S = \{v_1, v_2, \ldots, v_m\}$ is a
linearly independent subset of $V$ and $n > m$, then there are vectors $v_{m+1}, \ldots, v_{m+t}$
in $V$ such that $B = \{v_1, v_2, \ldots, v_m, v_{m+1}, \ldots, v_{m+t}\}$ forms a basis of $V$, where
$t = n - m$.

Remark Any linearly independent subset of a finite dimensional vector space V can 
be extended to a basis of V. 

Corollary 5 Let $U$ be a subspace of a finite dimensional vector space $V$ and
$\dim V = n$ $(n \ge 1)$. Then $U$ is also finite dimensional and $\dim U \le n$. Equality
holds iff $U = V$.

Theorem 8.4.6 Let $A$ and $B$ be subspaces of a finite dimensional vector space $V$.
Then $A + B$ is also finite dimensional and $\dim(A + B) = \dim A + \dim B -
\dim(A \cap B)$.

Proof Clearly, $A \cap B$ is finite dimensional. If $A \cap B \neq \{0\}$, then it has a
basis $S = \{v_1, v_2, \ldots, v_n\}$, say. This basis can be extended to a basis $S_A =
\{v_1, v_2, \ldots, v_n, v_{n+1}, \ldots, v_{n+t}\}$ of $A$ and to a basis $S_B = \{v_1, v_2, \ldots, v_n, w_{n+1}, \ldots,
w_{n+q}\}$ of $B$. Then $\dim A = n + t$, $\dim B = n + q$ and $\dim(A \cap B) = n$. Let $U =
L(\{v_{n+1}, \ldots, v_{n+t}\})$. Then $U \cap B = \{0\}$ and $C = \{v_1, v_2, \ldots, v_n, v_{n+1}, \ldots, v_{n+t},
w_{n+1}, \ldots, w_{n+q}\}$ is linearly independent and $L(C) = A + B$. Consequently,
$C$ forms a basis of $A + B$. Hence $\dim(A + B) = n + t + q = (n + t) + (n + q) - n =
\dim A + \dim B - \dim(A \cap B)$. If $A \cap B = \{0\}$, the same argument applies with $n = 0$
and $S = \emptyset$. □
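The dimension formula of Theorem 8.4.6 can be checked on a concrete pair of subspaces. The sketch below assumes the field $\mathbf{Z}_2$ and the space $(\mathbf{Z}_2)^4$, so that a subspace of dimension $k$ is simply a set with $2^k$ elements; the generators and the names `span` and `dim` are chosen for illustration only.

```python
from itertools import product

N = 4  # work in (Z_2)^4, an assumed small example


def span(gens):
    """Set of all Z_2-linear combinations of the given generators."""
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % 2
                  for i in range(N))
            for coeffs in product((0, 1), repeat=len(gens))}


def dim(subspace):
    """A subspace of (Z_2)^N with 2**k elements has dimension k."""
    return len(subspace).bit_length() - 1


gens_a = [(1, 1, 0, 0), (0, 0, 1, 1)]
gens_b = [(1, 1, 1, 1), (1, 0, 0, 0)]
A, B = span(gens_a), span(gens_b)
sum_ab = span(gens_a + gens_b)  # A + B is spanned by both generating sets
meet = A & B                    # the intersection of A and B as a set of vectors
```

Here $\dim A = \dim B = 2$, the intersection is spanned by $(1, 1, 1, 1)$, and the sum has dimension $3 = 2 + 2 - 1$.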

Corollary 1 For any two subspaces $A$ and $B$ of a finite dimensional vector
space $V$, $\dim(A + B) \le \dim A + \dim B$. Equality holds iff $A \cap B = \{0\}$.

Corollary 2 Let $V$ be a finite dimensional vector space and $A$ be a subspace of $V$.
Then there exists a subspace $B$ of $V$ such that $V = A \oplus B$ and $\dim V = \dim A +
\dim B$.

Proof Let $B_A = \{v_1, v_2, \ldots, v_n\}$ be a basis of $A$. Then $B_A$ can be extended to a basis
$B_V = \{v_1, v_2, \ldots, v_n, u_1, u_2, \ldots, u_p\}$ of $V$. Let $B = L(\{u_1, u_2, \ldots, u_p\})$. Then $V =
A + B$ and $A \cap B = \{0\}$. Hence $V = A \oplus B$. Then $\dim V = \dim A + \dim B$. □

If $V$ is a finite dimensional vector space and $W$ is a subspace of $V$, what is the
dimension of $V/W$? The following theorem gives the answer.

Theorem 8.4.7 Let W be a subspace of a finite dimensional vector space V. Then
$\dim(V/W) = \dim V - \dim W$.

Proof Let $\dim V = n$, $\dim W = m$ and $B_W = \{w_1, w_2, \ldots, w_m\}$ be a basis of W.
We now extend $B_W$ to a basis $B = \{w_1, w_2, \ldots, w_m, v_1, v_2, \ldots, v_t\}$ of V such that
$\dim V = m + t = n$. Then $A = \{v_1 + W, v_2 + W, \ldots, v_t + W\}$ forms a basis of V/W.
Clearly, $\dim(V/W) = t = n - m$. □

Corollary Let W be a subspace of a finite dimensional vector space V. Then
$\{v_1 + W, v_2 + W, \ldots, v_m + W\}$ constitutes a basis of V/W iff $\{v_1, v_2, \ldots, v_m\}$ constitutes
a basis of a subspace U of V such that $V = U \oplus W$.


8.4.2 Coordinate System 

Proposition 8.4.2 allows us to extend the usual concept of a coordinate system in the
Euclidean plane $\mathbb{R}^2$ or Euclidean space $\mathbb{R}^3$ to any finite dimensional vector space.

Definition 8.4.9 Let V be an n-dimensional vector space over F and
$B = \{v_1, v_2, \ldots, v_n\}$ be a basis of V. Then the unique representation of v in V as

$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n, \quad \text{with } \alpha_i \in F,$$

determines the unique n-tuple $(\alpha_1, \alpha_2, \ldots, \alpha_n) \in F^n$, called the coordinate of v
referred to the basis B, i.e., referred to axes $OX_1, OX_2, \ldots, OX_n$ (each extended
infinitely on both sides) as coordinate axes, where $X_1, X_2, \ldots, X_n$ are the points
corresponding to the vectors $v_1, v_2, \ldots, v_n$ respectively, called reference points along
these axes.

Remark The i-th coordinate $\alpha_i$ of a point v is determined only from the entire
coordinate system and not from $v_i$ alone. The coordinates of v in V depend on the choice
of the basis of V, like the choice of coordinate axes in $\mathbb{R}^2$ or $\mathbb{R}^3$. Corresponding to
different bases of V, we get different coordinates of the same vector in V.
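
The dependence of coordinates on the chosen basis can be illustrated with a short computation. The following Python sketch is only an illustration under stated assumptions: it works in $\mathbb{Q}^2$, the function name `coordinates_2d` and the sample bases are hypothetical choices, and the 2 × 2 system is solved by Cramer's rule.

```python
from fractions import Fraction

def coordinates_2d(v, b1, b2):
    """Return (a1, a2) with v = a1*b1 + a2*b2, using Cramer's rule.

    Assumes {b1, b2} is a basis of Q^2 (non-zero determinant).
    """
    det = Fraction(b1[0] * b2[1] - b1[1] * b2[0])
    if det == 0:
        raise ValueError("b1, b2 do not form a basis")
    a1 = Fraction(v[0] * b2[1] - v[1] * b2[0]) / det
    a2 = Fraction(b1[0] * v[1] - b1[1] * v[0]) / det
    return a1, a2

v = (3, 5)
standard = coordinates_2d(v, (1, 0), (0, 1))   # coordinates in the standard basis
skewed = coordinates_2d(v, (1, 1), (1, -1))    # same vector, different basis
print(standard, skewed)
```

The same vector (3, 5) has coordinate (3, 5) in the standard basis and (4, -1) in the basis {(1, 1), (1, -1)}, since 4(1, 1) - (1, -1) = (3, 5).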


8.4.3 Affine Set 

In $\mathbb{R}^3$ the non-trivial proper subspaces are the lines through the origin and the planes through
the origin. But in geometry we also study planes and lines not necessarily passing
through the origin. We now introduce a concept called an affine set, which is closely
related to the concept of a subspace.

Definition 8.4.10 Let V be a vector space over F. A subset A of V is said to be
an affine set iff either $A = \emptyset$ or $A = v + W$ (a coset) for some subspace W of V
and some $v \in V$. An affine combination of vectors $v_1, v_2, \ldots, v_n$ of V is a linear
combination of the form $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$ such that $\alpha_1 + \alpha_2 + \cdots + \alpha_n = 1$.

Example 8.4.4 A convex combination of x and y in $\mathbb{R}^2$ has the form $tx + (1 - t)y$
for real t with $0 \leq t \leq 1$. An affine combination of points $v_1, v_2, \ldots, v_m$ in $\mathbb{R}^n$ is a point
$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_m v_m$, where $\alpha_i \in \mathbb{R}$ and $\sum_{i=1}^{m} \alpha_i = 1$. A convex combination
is an affine combination for which $\alpha_i \geq 0$ for all i.

Remark A subset A of Euclidean space is called an affine set iff for every pair of
distinct elements x, y in A, the line determined by x and y lies entirely in A. By
default, $\emptyset$ and one-point subsets are affine.

Definition 8.4.11 An affine set $A \subset \mathbb{R}^n$ is said to be spanned by
$\{v_0, v_1, \ldots, v_m\} \subset \mathbb{R}^n$ iff A consists of all affine combinations of these vectors.

Definition 8.4.12 An ordered set of vectors $\{v_0, v_1, \ldots, v_m\} \subset \mathbb{R}^n$ is said to be
affinely independent iff $\{v_1 - v_0, v_2 - v_0, \ldots, v_m - v_0\}$ is a linearly independent
subset of $\mathbb{R}^n$.
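
Definition 8.4.12 reduces affine independence to a rank computation on the difference vectors. The sketch below is one possible way to carry this out over the rationals; the helper names `rank` and `affinely_independent` and the sample point sets are assumptions made for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    """Check Definition 8.4.12: the differences v_i - v_0 are linearly independent."""
    v0 = points[0]
    diffs = [[a - b for a, b in zip(p, v0)] for p in points[1:]]
    return rank(diffs) == len(diffs)

triangle = [(0, 0), (1, 0), (0, 1)]     # three non-collinear points
collinear = [(0, 0), (1, 1), (2, 2)]    # three points on one line
print(affinely_independent(triangle), affinely_independent(collinear))
```

The three vertices of a triangle are affinely independent, whereas three collinear points are not, because their two difference vectors are proportional.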

Definition 8.4.13 Let $S = \{v_0, v_1, \ldots, v_m\}$ be an affinely independent subset of $\mathbb{R}^n$.
Then the convex set spanned by this set S, denoted by $\langle S \rangle$, is the set of all convex
combinations of the vectors in S, i.e., a point v is in $\langle S \rangle$ iff v can be expressed
uniquely as

$$v = \alpha_0 v_0 + \alpha_1 v_1 + \cdots + \alpha_m v_m, \tag{8.1}$$

where the $\alpha_i$'s are all non-negative real numbers such that
$\alpha_0 + \alpha_1 + \alpha_2 + \cdots + \alpha_m = 1$. This gives a unique (m + 1)-tuple $(\alpha_0, \alpha_1, \ldots, \alpha_m)$ with
$\sum_{i=0}^{m} \alpha_i = 1$ and $v = \sum_{i=0}^{m} \alpha_i v_i$. This (m + 1)-tuple is called the barycentric
coordinate of v relative to the ordered set S.

Definition 8.4.14 Let $S = \{v_0, v_1, \ldots, v_m\}$ be an affinely independent subset of $\mathbb{R}^n$.
The convex set $\langle S \rangle$ is called the (affine) m-simplex with vertices $v_0, v_1, \ldots, v_m$,
denoted by $\langle v_0, v_1, \ldots, v_m \rangle$.

Example 8.4.5 A 0-simplex is a point, a 1-simplex is a closed line segment, a
2-simplex is a triangle (with its interior points), and a 3-simplex is a (solid) tetrahedron.
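
For a 2-simplex the barycentric coordinate of Definition 8.4.13 can be computed by solving a 2 × 2 linear system. The following sketch assumes points with rational coordinates in the plane; the names `barycentric` and `in_simplex` and the sample triangle are hypothetical.

```python
from fractions import Fraction

def barycentric(v, v0, v1, v2):
    """Barycentric coordinates (a0, a1, a2) of v relative to (v0, v1, v2) in Q^2.

    Solves v - v0 = a1*(v1 - v0) + a2*(v2 - v0) by Cramer's rule and sets
    a0 = 1 - a1 - a2. Assumes v0, v1, v2 are affinely independent.
    """
    e1 = (v1[0] - v0[0], v1[1] - v0[1])
    e2 = (v2[0] - v0[0], v2[1] - v0[1])
    w = (v[0] - v0[0], v[1] - v0[1])
    det = Fraction(e1[0] * e2[1] - e1[1] * e2[0])
    if det == 0:
        raise ValueError("vertices are not affinely independent")
    a1 = Fraction(w[0] * e2[1] - w[1] * e2[0]) / det
    a2 = Fraction(e1[0] * w[1] - e1[1] * w[0]) / det
    return (1 - a1 - a2, a1, a2)

def in_simplex(coords):
    """A point lies in the simplex iff all barycentric coordinates are >= 0."""
    return all(a >= 0 for a in coords)

vertices = ((0, 0), (4, 0), (0, 4))
inside = barycentric((1, 1), *vertices)
outside = barycentric((4, 4), *vertices)
print(inside, in_simplex(inside))
print(outside, in_simplex(outside))
```

The point (1, 1) has barycentric coordinate (1/2, 1/4, 1/4) and lies in the triangle, while (4, 4) has coordinate (-1, 1, 1): it is an affine combination of the vertices but not a convex one.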


8.4.4 Exercises 

Exercises-I 

1. Let v be a non-zero vector in $\mathbb{R}^2$. Then the subspace $W = \{rv : r \in \mathbb{R}\}$ is the
straight line passing through the origin.

[Hint. Let $(\alpha, \beta)$ represent the vector v in $\mathbb{R}^2$. Then the line passing through the
origin (0, 0) and $(\alpha, \beta)$ is $\frac{x}{\alpha} = \frac{y}{\beta}$. Hence any point on the line is given by
$(r\alpha, r\beta)$ for $r \in \mathbb{R}$.]

2. Let U be a subspace of a vector space V and $v \in V$ be a fixed vector. Then the
set S defined by $S = v + U$ is an affine set.

3. Let $W = \{(x, y) : ax + by = 0\}$ for a fixed non-zero (a, b) in $\mathbb{R}^2$. Then W is a
one-dimensional subspace of $\mathbb{R}^2$ and the cosets of W in $\mathbb{R}^2$ are given by the lines
$ax + by = c$, for $c \in \mathbb{R}$.

4. Let $W = \{(x, y, z) \in \mathbb{R}^3 : y = z\}$. Then W is a subspace of $\mathbb{R}^3$ such that
$\dim W = 2$.

5. Let $M_2(\mathbb{R})$ be the vector space of all 2 × 2 matrices over $\mathbb{R}$ and S be the set of
all symmetric matrices in $M_2(\mathbb{R})$. Then $\dim S = 3$.

6. Let V be an n -dimensional vector space. Then the following statements are 
equivalent: 

(a) $B = \{v_1, v_2, \ldots, v_n\}$ is a basis of V;

(b) $B = \{v_1, v_2, \ldots, v_n\}$ is a maximal linearly independent set;

(c) $B = \{v_1, v_2, \ldots, v_n\}$ is a minimal generating set.


8.5 Linear Transformations and Associated Algebra 

The central problem of linear algebra is to study the algebraic structure of linear 
transformations. 

Like homomorphisms of groups and rings, a linear transformation is a map
preserving the algebraic structure of vector spaces. This concept is at the heart of linear
algebra. Many problems of mathematics are solved with the help of linear
transformations. The importance of vector spaces rests mainly on the linear
transformations they carry, because many problems of algebra and analysis, when properly
posed, may be reduced to the study of linear transformations of vector spaces. For
example, linear transformations play an important role in the study of matrices,
differential and integral equations, and integration theory. The aim of this section is
to extend the study of vector spaces with the help of linear transformations.

Definition 8.5.1 Let V and W be vector spaces over the same field F. A linear
transformation T from V to W is a mapping $T : V \to W$ such that

L(i) $T(x + y) = T(x) + T(y)$, for all x, y in V (additivity law);

L(ii) $T(\alpha x) = \alpha T(x)$, for all x in V, and for all $\alpha$ in F (homogeneity law).

Conditions L(i) and L(ii) can be combined together to obtain an equivalent
condition:

L(iii) $T(\alpha x + \beta y) = \alpha T(x) + \beta T(y)$, for all $\alpha, \beta$ in F and for all $x, y \in V$.

In particular, if V = W, a linear transformation $T : V \to V$ is called a linear
operator.

A linear transformation $T : V \to W$ is called a monomorphism, epimorphism or
isomorphism according to T being injective, surjective or bijective.

Remark Every linear transformation is a homomorphism of the corresponding ad- 
ditive groups. 

Example 8.5.1 (a) Each of the following examples is a linear transformation (LT):

(i) The identity map $\mathrm{Id} : V \to V$, $x \mapsto x$, for every vector space V is an LT.

(ii) For $a \in \mathbb{R}$, $T_a : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (ax, ay)$ is an LT.

(iii) $T_1 : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (y, x)$, i.e., $T_1$ reflects $\mathbb{R}^2$ about the line y = x, is an
LT.

(iv) $T_2 : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x, 0)$, i.e., $T_2$ projects $\mathbb{R}^2$ onto the x-axis, is an LT.

(v) Let C([0, 1]) be the vector space of all real valued continuous functions
defined on [0, 1]. Then the map

$$I : C([0, 1]) \to \mathbb{R}, \quad f \mapsto \int_0^1 f(x)\,dx$$

is an LT.

(vi) For any matrix $M \in M_{m,n}(\mathbb{R})$, the map $T_M : \mathbb{R}^n \to \mathbb{R}^m$, $X \mapsto MX$, X viewed
as a column matrix, is an LT.

The usual dot product may be viewed as a special case of $T_M$, where
$T_M : \mathbb{R}^n \to \mathbb{R}$, $X \mapsto MX$ for every row matrix M, i.e., if

$$M = (a_1, a_2, \ldots, a_n) \quad \text{and} \quad X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},$$

then $T_M(X) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ is an LT.

(vii) Let V be the vector space of all real valued functions in x having derivatives
of all orders. Then

$$\frac{d}{dx} = D : V \to V, \quad f \mapsto f'(x) = \frac{df}{dx}$$

is an LT.

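(viii) Let V be the vector space of all real valued functions f in n variables
$x_1, x_2, \ldots, x_n$ admitting partial derivatives $\frac{\partial f}{\partial x_i}$ for all i. Then the gradient of f defined by

$$\operatorname{grad} f(X) = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)$$

is an LT.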

(b) The map $T : \mathbb{R}^2 \to \mathbb{R}^1$, $(x, y) \mapsto \sin(x + y)$ is not an LT.

This is so because $0 = T(\pi, 0) = T\big((\tfrac{\pi}{2}, 0) + (\tfrac{\pi}{2}, 0)\big) \neq 2T(\tfrac{\pi}{2}, 0) = 2$.
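
Condition L(iii) can be tested numerically on sample vectors. The sketch below is only an illustration: the helper names `matrix_map` and `satisfies_L3` and the sample inputs are assumptions, and a single passing sample does not prove linearity, whereas a single failing sample does disprove it.

```python
import math

def matrix_map(M):
    """Return the map T_M: X -> M X for a matrix M given as a list of rows."""
    def T(X):
        return tuple(sum(a * x for a, x in zip(row, X)) for row in M)
    return T

def satisfies_L3(T, x, y, alpha, beta, tol=1e-9):
    """Test condition L(iii) for one choice of vectors and scalars."""
    combo = tuple(alpha * a + beta * b for a, b in zip(x, y))
    lhs = T(combo)
    rhs = tuple(alpha * a + beta * b for a, b in zip(T(x), T(y)))
    return all(abs(p - q) <= tol for p, q in zip(lhs, rhs))

reflect = matrix_map([[0, 1], [1, 0]])          # (x, y) -> (y, x)
sine = lambda X: (math.sin(X[0] + X[1]),)       # (x, y) -> sin(x + y)

half_pi = (math.pi / 2, 0.0)
print(satisfies_L3(reflect, (1, 2), (3, -1), 2, 5))   # holds for this sample
print(satisfies_L3(sine, half_pi, half_pi, 1, 1))     # fails: sin(pi) != 2
```

The reflection of Example 8.5.1(a)(iii) passes the sample test, while the map of part (b) fails it on exactly the vectors used in the text.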

Like groups, we define kernel and image of a linear transformation of vector 
spaces. 

Definition 8.5.2 If $T : V \to W$ is a linear transformation, then
$\ker T = \{v \in V : T(v) = 0\} \subset V$ is called the kernel of T and
$\operatorname{Im} T = \{T(v) : v \in V\} \subset W$ is called the range or image of T.

Clearly, ker T is a subspace of V, called the null space of T, and Im T is a
subspace of W, called the range space of T.

Interpretation If T is a left multiplication by a matrix M (see Example 8.5.1(vi)),
then ker T is the set of solutions (in $\mathbb{R}^n$) of the homogeneous linear equation MX = 0
and Im T is the set of all vectors C (in $\mathbb{R}^m$) such that MX = C has a solution.

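Example 8.5.2 (i) If $T : \mathbb{R}^3 \to \mathbb{R}^2$, $(x, y, z) \mapsto (x, y)$, then
$\ker T = \{(x, y, z) \in \mathbb{R}^3 : T(x, y, z) = (0, 0)\} = \{(0, 0, z) \in \mathbb{R}^3\}$ is the z-axis in $\mathbb{R}^3$ and Im T is the whole
Euclidean plane $\mathbb{R}^2$.

(ii) If $T : M_2(\mathbb{R}) \to \mathbb{R}^4$, $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (a, b, c, d)$, then T is a linear map such that
$\ker T = \left\{ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right\}$ and $\operatorname{Im} T = \mathbb{R}^4$.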

We now establish that an n-dimensional vector space over F and $F^n$ are
isomorphic as vector spaces.

Theorem 8.5.1 Let V be an n-dimensional vector space over F ($n \geq 1$). Then V
is isomorphic to the vector space $F^n$.

Proof Let $W = F^n$ and $B = \{v_1, \ldots, v_n\}$ be a basis of V. Then every vector v in
V can be expressed uniquely as $v = \alpha_1 v_1 + \cdots + \alpha_n v_n$, $\alpha_i \in F$. Using this coordinate
of v referred to B, we define the map $T : V \to F^n$, $v \mapsto (\alpha_1, \alpha_2, \ldots, \alpha_n)$. Then T
is an isomorphism of vector spaces. □

Remark The isomorphism T depends on the basis B and also on the order of the
elements of B.

Corollary Any non-zero n-dimensional vector space V over $\mathbb{R}$ (or $\mathbb{C}$) is isomorphic
to $\mathbb{R}^n$ (or $\mathbb{C}^n$) as vector spaces.

For example, the vector space $V_n[x]$ (see Example 8.1.1(iv)) over $\mathbb{R}$ and the
vector space $\mathbb{R}^n$ have the same algebraic structure as vector spaces.

For a given subset X of a vector space V over F, the set M(X) of all F-valued
functions on X which vanish outside some finite set is a vector space over F
under usual addition and scalar multiplication. Using this fact we generalize
Theorem 8.5.1.

Theorem 8.5.2 Let V be a non-zero vector space over F. If B is a basis of V, then
V is isomorphic to the vector space M(B).

Proof Let $B = \{v_i : i \in I\}$ be a basis of V. We now construct an isomorphism of V
onto M(B) by assigning to each vector v in V an F-valued function $f_v$ defined on
B as follows: If $v \neq 0$, then v can be expressed uniquely as

$$v = \alpha_1 v_{i_1} + \alpha_2 v_{i_2} + \cdots + \alpha_n v_{i_n},$$

where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are non-zero elements in F, $S = \{v_{i_1}, v_{i_2}, \ldots, v_{i_n}\}$ is a finite
subset of B and $f_v$ is defined by

$$f_v(u) = \begin{cases} 0, & \text{if } u \in B \text{ lies outside the set } S; \\ \alpha_j, & \text{if } u = v_{i_j} \text{ lies inside the set } S. \end{cases}$$

For v = 0 we take $f_v$ to be the zero function. Then the map $\psi : V \to M(B)$,
$v \mapsto f_v$ is an isomorphism of vector spaces. □

We now give another extension of Theorem 8.5.1. 

Theorem 8.5.3 Let V be a non-zero vector space over F . Then V is isomorphic to 
the direct sum of copies of F . 

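Proof Let $B = \{v_i : i \in I\}$ be a basis of V. It is sufficient to prove that
$V = \bigoplus_{i \in I} F v_i$, where

$$F v_i = \{\alpha v_i : \alpha \in F\} \quad \text{and} \quad F v_i \cong F \text{ for every } i \in I.$$

Clearly, $F v_i$ is a non-empty subset of V, since $v_i = 1 \cdot v_i \in F v_i$. Again $F v_i$ is a
subspace of V for each $v_i \in B$. Let $v \in V$. Then v can be expressed uniquely as

$$v = \sum_{i \in J} \alpha_i v_i, \quad \text{where } J \text{ is a finite subset of } I.$$

Since $\alpha_i v_i \in F v_i$, $\forall i \in J$ (hence $\forall i \in I$), v can be expressed uniquely as
$v = \sum_i u_i$, where $u_i \in F v_i$. Thus

$$u_i = \begin{cases} \alpha_i v_i, & \text{for } i \in J \\ 0, & \text{for all } i \text{ outside } J. \end{cases}$$

Hence $V = \bigoplus_{i \in I} F v_i$, where each $T_i : F \to F v_i$, $\alpha \mapsto \alpha v_i$ is an isomorphism. □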

Remark Theorems 8.5.1 to 8.5.3 show that an abstract vector space V seems to have
a nice representation for study. For example, an n-dimensional vector space over
$\mathbb{R}$ may be studied through $\mathbb{R}^n$, which is easier to visualize. But such a
representation loses much of its importance, because it depends on the choice of a basis B
of V. Moreover, almost all the vector spaces of greatest importance carry additional
structure (algebraic or topological), which need not be related to the above
isomorphisms. For example, the product of two polynomials in the vector space $V_n[x]$ is
defined in a natural way but there is no similar concept in $\mathbb{R}^n$, although $V_n[x]$ and
$\mathbb{R}^n$ are isomorphic as vector spaces. We now give some interesting properties of
linear transformations.

Proposition 8.5.1 Let $T : U \to V$ be a linear transformation. Then

(i) $T(0) = 0$;

(ii) $T(-u) = -T(u)$ for all u in U;

(iii) T is injective iff $\ker T = \{0\}$.

Proof Left as an exercise. □ 

Theorem 8.5.4 Let U and V be finite dimensional vector spaces over F and
$B = \{u_1, u_2, \ldots, u_n\}$ be a basis of U.

(i) If $T : U \to V$ is a linear transformation, then
$T(B) = \{T(u_1), T(u_2), \ldots, T(u_n)\}$ generates Im T.
Moreover, if $\ker T = \{0\}$, then T(B) constitutes a basis of Im T.

(ii) For arbitrary non-zero elements $v_1, v_2, \ldots, v_n$ in V, there exists a unique linear
transformation $T : U \to V$ such that $T(u_i) = v_i$, $i = 1, 2, \ldots, n$.

(iii) T defined in (ii) is injective iff the vectors $v_1, v_2, \ldots, v_n$ are linearly
independent.

Proof (i) If $v \in \operatorname{Im} T$, $\exists$ some $u \in U$ such that $T(u) = v$. As B is a basis of U,
$\exists\, \alpha_1, \alpha_2, \ldots, \alpha_n \in F$ such that $u = \sum_{i=1}^{n} \alpha_i u_i$. This shows by linearity of T that
Im T is generated by T(B). If $\ker T = \{0\}$, then T is a monomorphism and hence
T(B) is linearly independent and forms a basis of Im T.

(ii) Left as an exercise.

(iii) Let $v_1, v_2, \ldots, v_n$ be linearly independent and $u \in \ker T$. Then
$u = \sum_{i=1}^{n} \alpha_i u_i$ for some $\alpha_i \in F$ shows that $T(u) = \sum_{i=1}^{n} \alpha_i T(u_i) = 0$. Hence
$\sum_{i=1}^{n} \alpha_i v_i = 0 \Rightarrow \alpha_i = 0, \forall i \Rightarrow \ker T = \{0\} \Rightarrow T$ is injective. Its converse part
follows from (i). □

Remark Let U be a finite dimensional vector space. Then any linear transformation
$T : U \to V$ is completely determined by the action of T on a basis of U.

Like groups and rings, the homomorphism theorems are obtained for vector 
spaces. 

Theorem 8.5.5 (First Isomorphism Theorem) Let $T : U \to V$ be a linear
transformation. Then the vector spaces U/ker T and Im T are isomorphic. Conversely, if U
is a vector space and W is a subspace of U, then there is a linear transformation
from U onto U/W.

Proof Proceed as in groups. □ 

Corollary 1 If a linear transformation $T : U \to V$ is onto, then the vector spaces
U/ker T and V are isomorphic.

Corollary 2 Let V be a finite dimensional vector space and $T : V \to V$ a linear
operator. Then T is injective iff T is surjective.

Proof By the First Isomorphism Theorem, $V/\ker T \cong \operatorname{Im} T = U$ (say). Then
$\dim U = \dim V - \dim(\ker T)$. Clearly, T is surjective $\Leftrightarrow V = \operatorname{Im} T = U \Leftrightarrow \ker T = \{0\}$
$\Leftrightarrow$ T is injective. □

Definition 8.5.3 Let $T : U \to V$ be a linear transformation. Then dim(Im T) is
called the rank of T and dim(ker T) is called the nullity of T.

If U is finite dimensional, then ker T and Im T are both finite dimensional and 
their dimensions satisfy the following relation. 

Theorem 8.5.6 (Sylvester's Law of Nullity) Let U be a finite dimensional vector
space and $T : U \to V$ be a linear transformation. Then
$\dim U = \dim(\ker T) + \dim(\operatorname{Im} T)$, i.e., dim U = nullity of T + rank of T.

Proof Let $\dim U = n$ and $\dim(\ker T) = k \leq n$. We claim that $\dim(\operatorname{Im} T) = n - k$.
Clearly, $\dim(\ker T) = k$ implies that there exists a basis $B = \{u_1, u_2, \ldots, u_k\}$ of
ker T. We now extend the basis B to a basis $B_U = \{u_1, u_2, \ldots, u_k, w_1, w_2, \ldots, w_{n-k}\}$
of U. Let $B_1 = \{T(w_1), \ldots, T(w_{n-k})\}$. Then $B_1$ is linearly independent and
generates Im T. Consequently, $B_1$ is a basis of Im T and hence $\dim(\operatorname{Im} T) = n - k$. □
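
Sylvester's law can be checked on a concrete matrix map $X \mapsto MX$ by row reduction: the pivot columns count the rank and each free column yields one kernel vector. The following sketch works over the rationals; the helper names `rref` and `kernel_basis` and the sample matrix M are assumptions made for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over Q; returns (matrix, pivot_columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def kernel_basis(rows):
    """Basis of {X : M X = 0}, one vector per free (non-pivot) column."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            vec[p] = -m[row_index][free]
        basis.append(vec)
    return basis

M = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]          # third row = first row + second row
_, pivots = rref(M)
rank, nullity = len(pivots), len(kernel_basis(M))
print(rank, nullity, rank + nullity == len(M[0]))
```

Here M defines a map from $\mathbb{Q}^4$ to $\mathbb{Q}^3$ of rank 2 and nullity 2, so that rank + nullity = 4 = dim U, in agreement with Theorem 8.5.6.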

This theorem gives a series of corollaries. 

Corollary 1 Let U be an n-dimensional vector space and $T : U \to V$ a linear
transformation. Then the following statements are equivalent:

(i) T is injective;

(ii) rank of T = n;

(iii) $B = \{u_1, u_2, \ldots, u_n\}$ is a basis of U implies $T(B) = \{T(u_1), T(u_2), \ldots, T(u_n)\}$
is a basis of Im T, i.e., dim U = rank of T.

Proof Left as an exercise. □ 

Corollary 2 Let V be a finite dimensional vector space and $T : V \to V$ a linear
operator. Then T is injective iff T is surjective. In particular, a linear transformation
$T : \mathbb{R}^n \to \mathbb{R}^n$ is injective iff T is surjective.

Proof Using the First Isomorphism Theorem it follows that
$\dim(T(V)) = \dim V - \dim(\ker T)$. Hence T is injective iff T is surjective, because V is finite dimensional.
For $V = \mathbb{R}^n$, the last part follows. □

Corollary 3 Let U and V be finite dimensional vector spaces and $T : U \to V$ a
linear transformation. If T is an isomorphism, then $\dim U = \dim V$.

Proof Left as an exercise. □

Let U and V be two vector spaces over the same field F. We are yet to define any
algebraic structure on the set L(U, V) of all linear transformations from the vector
space U to the vector space V. The map $0 : U \to V$, $u \mapsto 0 \in V$ (i.e., it carries every
element $u \in U$ to the zero element of V) and the negative of a linear transformation
$T : U \to V$, denoted by $(-T)$ and defined by $(-T) : U \to V$, $u \mapsto -T(u)$, are linear
transformations.

Proposition 8.5.2 L(U, V) forms a vector space under the compositions:
$(T + S)(u) = T(u) + S(u)$ and $(\alpha T)(u) = \alpha T(u)$, where $u \in U$, $\alpha \in F$ and $T, S \in L(U, V)$.

Proof Left as an exercise. □ 

Proposition 8.5.3 The composite of two linear transformations is a linear trans- 
formation. 

Proof Let U, V and W be vector spaces over F; $S : U \to V$ and $T : V \to W$ be
linear transformations. It is sufficient to prove that their composite $T \circ S : U \to W$
defined by $(T \circ S)(x) = T(S(x))$ for all x in U is also a linear transformation. Since
$(T \circ S)(\alpha x + \beta y) = T(\alpha S(x) + \beta S(y)) = \alpha T(S(x)) + \beta T(S(y))$ $\forall x, y \in U$ and $\forall \alpha, \beta \in F$,
$T \circ S$ is a linear transformation. □

We now give an alternative definition of a linear isomorphism. 

Definition 8.5.4 Let U and V be vector spaces over F. A linear transformation
$T : U \to V$ is said to be a (linear) isomorphism iff there exists a linear transformation
$S : V \to U$ such that $S \circ T = I_U$ (identity map) and $T \circ S = I_V$. In particular, a linear
operator $T : V \to V$ is said to be non-singular or invertible iff T is an isomorphism.

We now characterize non-singular transformations. 

Proposition 8.5.4 Let $T : V \to V$ be a linear operator. Then the following
statements are equivalent:

(a) T is non-singular ; 

(b) nullity of T = 0; 

(c) rank of T = dim V. 

Proof Left as an exercise. □ 

Theorem 8.5.7 Let U and V be finite dimensional vector spaces over the same 
field F. If dim U = m and dim V = n, then dim (L(U, V )) = mn. 

Proof Let W be the vector space L(U, V) over F. Suppose $B_U = \{u_1, u_2, \ldots, u_m\}$
is a basis of U and $B_V = \{v_1, v_2, \ldots, v_n\}$ is a basis of V. Define a family of linear
transformations $\psi_{ij} : U \to V$ by

$$\psi_{ij}(u_i) = v_j, \quad \text{for } i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n$$

and

$$\psi_{ij}(u_k) = 0, \quad \text{for } k \neq i.$$

Then $\psi_{ij}$ maps $u_i$ into $v_j$ and the other u's into 0.

Clearly, $\{\psi_{ij}\}$ consisting of mn elements constitutes a basis of W. Consequently,
$\dim W = mn$. □

Corollary 1 If V is a finite dimensional vector space over F of dimension n, then
L(V, V) is also a finite dimensional vector space over F of dimension $n^2$.

Corollary 2 If V is an n-dimensional vector space over F, then L(V, F) is also
an n-dimensional vector space over F.

Let V be a vector space over F. Then linear transformations $T : V \to F$ lead to
some extremely important concepts and yield results which are not true in a general
setting.

Definition 8.5.5 For any vector space V over F, the vector space L(V, F) is called
its dual space, denoted by $V^d$, and an element of $V^d$ is called a linear functional of V
into F.


Remark A linear functional transforms a vector to a scalar. If V is an n-dimensional
vector space, then $\dim(V^d) = n$.

Example 8.5.3 (i) Let V = C([0, 1]) be the vector space of all real valued
continuous functions in x on [0, 1]. Then the map $I : V \to \mathbb{R}$, $f(x) \mapsto \int_0^1 f(x)\,dx$ (as in
Example 8.5.1(v)) is a linear functional and hence $I \in V^d$.

(ii) Let $V = \mathbb{R}[x]$ be the vector space (infinite dimensional) of all real polynomials
in x. Then for $a \in \mathbb{R}$, $D_a : V \to \mathbb{R}$, $f \mapsto f'(a)$ is a linear functional.

We now consider only finite dimensional vector spaces V, because if V is not
finite dimensional, then $V^d$ is too large to invite attention unless an additional
structure, such as a topology, is endowed.


Theorem 8.5.8 Let V be a finite dimensional vector space over F and
$B = \{v_1, v_2, \ldots, v_n\}$ a basis of V. Then the linear functionals $f_1, f_2, \ldots, f_n$ defined by

$$f_i(v_j) = \delta_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j \end{cases}$$

constitute a basis $B^d$ of the dual space $V^d$, called the dual basis of B.

Proof Suppose $\alpha_1 f_1 + \alpha_2 f_2 + \cdots + \alpha_n f_n = 0$. Applying both sides to $v_j$ gives

$$\sum_{i=1}^{n} \alpha_i f_i(v_j) = 0, \quad \text{where } (\alpha_1 f_1 + \alpha_2 f_2 + \cdots + \alpha_n f_n)(v_j) = \alpha_j.$$

Hence

$$\sum_{i=1}^{n} \alpha_i \delta_{ij} = 0 \Rightarrow \alpha_j = 0, \quad \forall j = 1, 2, \ldots, n.$$

Consequently, $B^d = \{f_1, f_2, \ldots, f_n\}$ is linearly independent. Again
$\dim V = \dim(V^d) = n$.

Hence $B^d$ consisting of n linearly independent elements of $V^d$ constitutes a basis
of $V^d$. □

Remark $\delta_{ij}$ is called the Kronecker delta.

Proposition 8.5.5 If V is a finite dimensional vector space over F and $v\ (\neq 0)$
is in V, then there exists a linear functional $f \in V^d$ such that $f(v) \neq 0$.

Proof Suppose $v \neq 0$. Then $\{v\}$ is linearly independent in V. Let dim V
be n. We can construct a basis $B = \{v = v_1, v_2, \ldots, v_n\}$ for V. Then for the
elements 1, 0, 0, ..., 0 in F, there exists a unique linear transformation $f : V \to F$
such that $f(v) = 1$, $f(v_i) = 0$, $i = 2, 3, \ldots, n$. This implies that $f \in V^d$ is such that
$f(v) \neq 0$. □

Example 8.5.4 Let $V = \mathbb{R}^3$, $v_1 = (1, -1, 3)$, $v_2 = (0, 1, -1)$ and $v_3 = (0, 3, -2)$.

Then $B = \{v_1, v_2, v_3\}$ is a basis of V. Its dual basis $B^d = \{f_1, f_2, f_3\}$ given by

$$\begin{aligned} f_1(x, y, z) &= \alpha_1 x + \beta_1 y + \gamma_1 z \\ f_2(x, y, z) &= \alpha_2 x + \beta_2 y + \gamma_2 z \\ f_3(x, y, z) &= \alpha_3 x + \beta_3 y + \gamma_3 z \end{aligned} \tag{8.2}$$

is such that $f_1(v_1) = 1$, $f_1(v_2) = 0$ and $f_1(v_3) = 0$;

$f_2(v_1) = 0$, $f_2(v_2) = 1$ and $f_2(v_3) = 0$;

$f_3(v_1) = 0$, $f_3(v_2) = 0$ and $f_3(v_3) = 1$.

Consequently, $f_1(x, y, z) = x$, $f_2(x, y, z) = 7x - 2y - 3z$ and $f_3(x, y, z) = -2x + y + z$.
Hence $V^d = L(B^d)$.
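
The dual basis of Example 8.5.4 can be recomputed mechanically: if P is the matrix whose columns are the basis vectors, the condition $f_i(v_j) = \delta_{ij}$ says that the coefficient rows of the $f_i$ form $P^{-1}$. The sketch below assumes this matrix formulation; the helper names `inverse` and `dual_basis` are hypothetical, and the inverse is computed by Gauss-Jordan elimination over the rationals.

```python
from fractions import Fraction

def inverse(A):
    """Inverse of a square matrix over Q by Gauss-Jordan elimination."""
    n = len(A)
    # Augment A with the identity matrix.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if m[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular: the vectors are not a basis")
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [x / m[c][c] for x in m[c]]
        for i in range(n):
            if i != c and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[c])]
    return [row[n:] for row in m]

def dual_basis(basis):
    """Rows of coefficients of f_1, ..., f_n with f_i(v_j) = delta_ij.

    If P has the basis vectors as columns, the rows of P^{-1} are the
    coefficient vectors of the dual basis functionals.
    """
    n = len(basis)
    P = [[basis[j][i] for j in range(n)] for i in range(n)]
    return inverse(P)

B = [(1, -1, 3), (0, 1, -1), (0, 3, -2)]
for coeffs in dual_basis(B):
    print([int(c) for c in coeffs])
```

The three rows are (1, 0, 0), (7, -2, -3) and (-2, 1, 1), which are the coefficients of $f_1$, $f_2$ and $f_3$ found in the example.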

We now introduce the concept of annihilators in vector spaces V, which is closely
related to dual spaces $V^d$.

Definition 8.5.6 Let X be a non-empty subset of a vector space V. Then the
annihilator $\mathcal{A}(X)$ of X is defined by

$$\mathcal{A}(X) = \{f \in V^d : f(x) = 0 \text{ for all } x \in X\} = \{f \in V^d : f(X) = 0\}.$$

$\mathcal{A}(X)$ has the following properties.

Proposition 8.5.6 Let X be any subspace of a finite dimensional vector space V
over F. Then $\mathcal{A}(X)$ is a subspace of $V^d$ such that $\dim(\mathcal{A}(X)) = \dim V - \dim X$.

Proof Clearly, $\mathcal{A}(X)$ is a subspace of $V^d$. Let $\dim V = n$, $\dim X = m$, $m \leq n$ and
$\{u_1, u_2, \ldots, u_m\}$ be a basis of X. Then this basis can be extended to a basis
$B = \{u_1, u_2, \ldots, u_m, v_1, v_2, \ldots, v_{n-m}\}$ for V. Let
$B^d = \{f_1, f_2, \ldots, f_m, g_1, g_2, \ldots, g_{n-m}\}$ be the dual basis for B. If $C = \{g_1, g_2, \ldots, g_{n-m}\}$, then C is a basis of $\mathcal{A}(X)$.
Hence $\dim(\mathcal{A}(X)) = n - m = \dim V - \dim X$. □

Corollary Let V be a finite dimensional vector space over F and U be a subspace
of V. Then the dual space $U^d$ is isomorphic to the quotient space $V^d/\mathcal{A}(U)$.

Proof Clearly, $\mathcal{A}(U)$ is a subspace of $V^d$ and
$\dim(V^d/\mathcal{A}(U)) = \dim(V^d) - \dim(\mathcal{A}(U)) = \dim U = \dim U^d$. Hence the corollary follows. □

An Application of Dual Spaces We now apply the results of dual spaces to obtain
the number of linearly independent solutions of a system of linear homogeneous
equations over a field F:

$$\begin{aligned} \alpha_{11} x_1 + \alpha_{12} x_2 + \cdots + \alpha_{1n} x_n &= 0 \\ \alpha_{21} x_1 + \alpha_{22} x_2 + \cdots + \alpha_{2n} x_n &= 0 \\ &\ \ \vdots \\ \alpha_{m1} x_1 + \alpha_{m2} x_2 + \cdots + \alpha_{mn} x_n &= 0 \end{aligned} \tag{A}$$

Let $S = \{\alpha_i = (\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in}) \in F^n : i = 1, 2, \ldots, m\}$ and $U = L(S)$. Then U is
a subspace of $F^n$.

If $\dim U = r \leq m$, we say that the system (A) is of rank r. Let $V = F^n$,
$B = \{e_1, e_2, \ldots, e_n\}$ be the natural (standard) basis of V and $B^d = \{f_1, f_2, \ldots, f_n\}$ be its
dual basis. Then any $f \in V^d$ can be expressed uniquely as

$$f = x_1 f_1 + x_2 f_2 + \cdots + x_n f_n, \quad x_i \in F.$$

If $f \in \mathcal{A}(U)$, then $f(u) = 0$ for all $u \in U$ and hence $0 = f(\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1n})$, since
$(\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1n}) \in U$. Consequently,

$$\begin{aligned} 0 &= f(\alpha_{11} e_1 + \alpha_{12} e_2 + \cdots + \alpha_{1n} e_n) \\ &= x_1 \alpha_{11} + \cdots + x_n \alpha_{1n}, \quad \text{since } f_i(e_j) = \delta_{ij}. \end{aligned}$$

Similar results hold for the other equations in (A). Conversely, for every solution
$(x_1, x_2, \ldots, x_n)$ of the equations in (A), there exists an element $x_1 f_1 + x_2 f_2 + \cdots + x_n f_n$
in $\mathcal{A}(U)$. Then the number of independent solutions of the system (A) is $\dim(\mathcal{A}(U))$,
i.e., n - r.

This proves the following theorem. 

Theorem 8.5.9 If the system of equations in (A), where $\alpha_{ij} \in F$, is of rank r, then
the number of linearly independent solutions in $F^n$ of the system is n - r.

Corollary If the number of unknowns exceeds the number of equations in (A), then 
there exists a non-trivial solution of the system (A). 


8.5.1 Algebra over a Field 

For any two vector spaces U and V over the same field F , the set L(U,V) admits a 
vector space structure over F (see Proposition 8.5.2). If V = U, we can introduce a 
multiplication on L(V, V ) making it a ring. We can combine in a natural way these 
two twin structures of L(V, V) to provide a very rich structure called an ‘algebra’ 
over F . 

We now give the formal definition of an algebra with an eye to L(V, V). 

Definition 8.5.7 An algebra A is a vector space over a field F, whose elements can
be multiplied in such a way that A is also a ring in which the scalar multiplication
and ring multiplication are interlinked by the rule

A(1) : For all $x, y \in A$ and $\alpha \in F$, $\alpha(xy) = (\alpha x)y = x(\alpha y)$.

A commutative algebra A is an algebra such that

A(2) : $xy = yx$ holds for all x, y in A.

An identity element in an algebra A is a non-zero element 1 in A such that
$1x = x1 = x$ for every x in A.

Remark If the identity element exists in an algebra, then it is unique. 

We call an algebra A over F real or complex according to F = R or C. 

Definition 8.5.8 A non-empty subset B of an algebra A is called a subalgebra of
A iff B is itself an algebra under the operations defined in A (i.e., B is a subspace
of A such that for all x, y in B, $xy \in B$).

Remark A subalgebra of an algebra is an algebra in its own right. One of the most 
important examples of an algebra is L(V, V). In general, it is a non-commutative 
algebra having zero divisors. 

We give some other examples.

Example 8.5.5 (i) R is a commutative real algebra under usual compositions in R. 

(ii) The real vector space C([0, 1]) is a commutative real algebra with identity. 

(iii) The vector space B (X, C) of all bounded complex valued functions on a 
topological space X is a commutative complex algebra. 

(iv) Let A be an algebra over F and $B = \{x \in A : xy = yx \text{ for all } y \in A\}$. Then
B is a subalgebra of A, called the center of A.

Definition 8.5.9 Let A be an algebra over F. An algebra ideal (or an ideal) I of A
is a non-empty subset of A such that I is a subspace of A (when A is considered as
a vector space) and also a ring ideal of A (when A is considered as a ring). For any
ideal I of A, A/I is an algebra, called the quotient algebra of A modulo I.

As for rings, homomorphisms are defined for algebras.

Definition 8.5.10 Let A and B be algebras over the same field F. A
homomorphism $\psi : A \to B$ is a mapping such that

$$\begin{aligned} \psi(x + y) &= \psi(x) + \psi(y); \\ \psi(xy) &= \psi(x)\psi(y); \\ \psi(\alpha x) &= \alpha\psi(x), \end{aligned}$$

for all x, y in A and $\alpha \in F$.

Moreover, if the homomorphism $\psi$ is a bijection, then $\psi$ is said to be an
isomorphism and A is said to be isomorphic to B.

Remark An algebra homomorphism f is a ring homomorphism such that f
preserves the vector space structure.

Proposition 8.5.7 Let $\psi : A \to B$ be a homomorphism of algebras. Then $\ker \psi$ is
a subalgebra of A and $\operatorname{Im} \psi$ is a subalgebra of B.

Proof Left as an exercise. □ 

We now prove a theorem for an algebra which is the analog of Cayley’s Theorem 
for a group. 

Theorem 8.5.10 Let A be an algebra with 1, over F . Then A is isomorphic to a 
subalgebra of L(V, V) for some vector space V over F. 

Proof A being an algebra over F, it is a vector space over F. We take V = A to
prove the theorem. For $x \in A$, we define the map $T_x : A \to A$ by $T_x(v) = xv$,
$\forall v \in A$. Then $T_x$ is a linear transformation on V = A and the map
$\psi : A \to L(V, V)$, $x \mapsto T_x$ is well defined. Moreover, $\psi$ is both a linear transformation of
vector spaces and a homomorphism of rings. Hence $\psi$ is a homomorphism of
algebras and $\ker \psi = \{x \in A : \psi(x) = 0\} = \{x \in A : T_x = 0\} = \{x \in A : T_x(v) = 0, \forall v \in V\}$.
Since V = A has identity 1, $T_x(1) = 0 \Rightarrow x = 0$. This shows that
$\ker \psi = \{0\}$ and hence $\psi$ is a monomorphism of algebras. Consequently, A is
isomorphic to the subalgebra $\operatorname{Im} \psi$ of L(V, V) over F. □
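
As an illustration of the map $x \mapsto T_x$ used in the proof, consider $\mathbb{C}$ as an algebra over $\mathbb{R}$ with basis {1, i}. The sketch below is an assumed example, not part of the proof: the helper names are hypothetical, and it checks on two sample elements that the matrix of left multiplication respects products and sums.

```python
def left_multiplication_matrix(x):
    """Matrix of T_x: v -> x*v on C viewed as R^2 with basis {1, i}.

    Columns are the coordinates of T_x(1) = x and T_x(i) = x*i.
    """
    a, b = x.real, x.imag
    return ((a, -b), (b, a))

def mat_mul(P, Q):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return tuple(tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def mat_add(P, Q):
    """Sum of two 2 x 2 matrices given as nested tuples."""
    return tuple(tuple(P[i][j] + Q[i][j] for j in range(2)) for i in range(2))

x, y = complex(1, 2), complex(3, -1)
psi = left_multiplication_matrix
print(psi(x * y) == mat_mul(psi(x), psi(y)))   # multiplication is preserved
print(psi(x + y) == mat_add(psi(x), psi(y)))   # addition is preserved
```

For these samples $\psi(xy) = \psi(x)\psi(y)$ and $\psi(x + y) = \psi(x) + \psi(y)$, so $\mathbb{C}$ appears as a subalgebra of the algebra of 2 × 2 real matrices, in the spirit of Theorem 8.5.10.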

Remark The algebra L(V,V) plays a universal role to obtain isomorphic copies of 
any algebra. 

We now study the algebra L(V, V), where V is restricted to finite dimensional 
vector spaces over F . 

Definition 8.5.11 A non-zero element T in L(V, V) is said to be invertible iff there
is an element S in L(V, V) such that $T \circ S = S \circ T = I$. Otherwise, T is said to be
singular.

Clearly, if T is invertible, S is unique and it is denoted by $T^{-1}$.

Theorem 8.5.11 Let A be an algebra over F, with 1, such that dim A = n. Then
every element in A is a root of some non-trivial polynomial f(x) in F[x] such that
$\deg f \leq n$.

Proof For $v \in A$, the (n + 1) elements $1, v, v^2, \ldots, v^n$ are in A. Since dim A = n,
these (n + 1) elements must be linearly dependent over F. Hence there exist (n + 1)
elements $a_0, a_1, \ldots, a_n$ in F (not all 0) such that $a_0 1 + a_1 v + \cdots + a_n v^n = 0$. This
shows that v is a root of the non-trivial polynomial $f(x) = a_0 + a_1 x + \cdots + a_n x^n$
in F[x] such that $\deg f \leq n$. □

Corollary Let V be an n-dimensional vector space over F. Then given a non-zero
element T in L(V, V), there exists a non-trivial polynomial f(x) in F[x] such that
$\deg f \leq n^2$ and $f(T) = 0$.

Proof For the algebra A = L(V, V) over F, $\dim A = n^2$. Clearly, A contains the
identity, which is the identity operator on V. The corollary follows from
Theorem 8.5.11. □

The corollary leads to the following definition. 

Definition 8.5.12 Let V be an n-dimensional vector space over F and T be a
non-zero element in L(V, V). Then there exists a non-trivial polynomial m(x) of lowest
degree with leading coefficient 1 in F[x] such that m(T) = 0. We call m(x) a minimal
polynomial for T over F.

Remark If T satisfies another monic polynomial g(x) of lowest degree in F[x], then m(x) divides
g(x) in F[x]. Since g(x) is monic of the same degree as m(x) and g(T) = 0, m(x) = g(x). This shows the
uniqueness of the monic polynomial of lowest degree satisfied by T over F, and we use the term the minimal
polynomial for T over F.

We now characterize invertible elements in L(V, V). 

Theorem 8.5.12 Let V be an n -dimensional vector space over F and T a non- 
zero element in L(V , V). Then T is invertible iff the constant term of the minimal 
polynomial for T over F is not zero. 

Proof Let $f(x) = a_0 + a_1 x + \cdots + a_{m-1}x^{m-1} + x^m \in F[x]$ be the minimal
polynomial for T. Then $0 = f(T) = a_0 I + a_1 T + \cdots + a_{m-1}T^{m-1} + T^m$.

If $a_0 \ne 0$, then $a_0 I = -(T^{m-1} + a_{m-1}T^{m-2} + \cdots + a_1 I)T$. Hence
$I = [-a_0^{-1}(T^{m-1} + a_{m-1}T^{m-2} + \cdots + a_1 I)]T$. If we take
$S = -a_0^{-1}(T^{m-1} + a_{m-1}T^{m-2} + \cdots + a_1 I)$, then $S \circ T = I$. Similarly, $T \circ S = I$.
Consequently, T is invertible. Conversely, let T be invertible in L(V, V). If possible,
let $a_0 = 0$. Then $0 = (a_1 I + a_2 T + \cdots + a_{m-1}T^{m-2} + T^{m-1})T$. Multiplying by $T^{-1}$, we have
$a_1 I + a_2 T + \cdots + a_{m-1}T^{m-2} + T^{m-1} = 0$. Hence T satisfies the polynomial
$q(x) = a_1 + a_2 x + \cdots + a_{m-1}x^{m-2} + x^{m-1}$ of degree (m − 1) in F[x]. But this
is not possible as the minimal polynomial f(x) is of degree m. This leads us to
conclude that $a_0 \ne 0$. □

Corollary Let V be a finite dimensional vector space over F and $T \in L(V, V)$ be
an invertible element. Then its inverse $T^{-1}$ is also a polynomial in T over F.
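
The construction of S in the proof of Theorem 8.5.12 can be tried on a small case. The following is only a minimal Python sketch under stated assumptions: the matrix A below is a hypothetical example (not from the text), and its minimal polynomial is assumed to be $x^2 - 5x - 2$, which the code checks by verifying that f(A) = 0 (A is not a scalar matrix, so no polynomial of degree 1 annihilates it). All helper names are illustrative.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple c * X."""
    return [[c * a for a in row] for row in X]

I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]  # hypothetical example

# Assumed minimal polynomial f(x) = a0 + a1*x + x^2 with a0 = -2, a1 = -5.
a0, a1 = Fraction(-2), Fraction(-5)

# f(A) = a0*I + a1*A + A^2 should be the zero matrix.
f_of_A = mat_add(mat_add(scale(a0, I2), scale(a1, A)), mat_mul(A, A))
zero = [[Fraction(0)] * 2 for _ in range(2)]

# S = -a0^{-1} (A + a1*I): the inverse built in the proof for m = 2.
S = scale(-1 / a0, mat_add(A, scale(a1, I2)))
```

Here S is a polynomial in A, as the corollary states, and both products `mat_mul(S, A)` and `mat_mul(A, S)` give the identity matrix.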


8.6 Correspondence Between Linear Transformations 
and Matrices 

In this section linear algebra starts to encompass the isomorphic theory of matrices. 

Linear transformations and matrices are closely related. In this section we
establish their relations and study matrices with the help of linear transformations
and vice versa. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation of the plane through an
angle $\theta$ about the origin defined by $T(x, y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta)$ and
$B = \{e_1 = (1, 0),\ e_2 = (0, 1)\}$ be the natural basis or standard basis of $\mathbb{R}^2$. Then the
actions of T on $e_1, e_2$ are given by

\[ T((1, 0)) = (\cos\theta, \sin\theta) = \cos\theta\,(1, 0) + \sin\theta\,(0, 1) \]

and

\[ T((0, 1)) = (-\sin\theta, \cos\theta) = -\sin\theta\,(1, 0) + \cos\theta\,(0, 1). \]

Then the 2 × 2 matrix

\[ A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]

is called the matrix of coefficients associated with T and its transpose

\[ A^t = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]

is called the matrix representation of T with respect to the given basis B. We now
extend this concept to an arbitrary finite dimensional vector space.

Remark We also use the notation of the same matrix A as

\[ A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}. \]


Definition 8.6.1 Let V and W be finite dimensional vector spaces over F
and $T : V \to W$ a linear transformation. Let $B_1 = \{v_1, v_2, \ldots, v_m\}$ and
$B_2 = \{w_1, w_2, \ldots, w_n\}$ be their respective ordered bases. For each i, $1 \le i \le m$, $T(v_i)$
in W can be expressed uniquely as $T(v_i) = \sum_{j=1}^{n} a_{ij} w_j$, $a_{ij} \in F$. The matrix
$A = (a_{ij})$ is called the matrix of coefficients associated with T and its transpose
$A^t$ is called the matrix representation of T with respect to the given bases $B_1$ and $B_2$.

The matrix $A^t$ is denoted by $m(T)_{B_1}^{B_2}$ or simply by m(T). In particular, for V = W
and T = I (identity), the matrix $m(I)_{B_1}^{B_2}$ is called the transition matrix from $B_1$
to $B_2$. Again if V = W and $B_1 = B_2 = B$, then $m(T)_{B_1}^{B_2}$ is simply written as $m(T)_B$.

Remark Some authors call A the matrix representation of T. The matrix m(T)
depends not only on T but also on the ordered bases $B_1$ and $B_2$.


Definition 8.6.2 Let $\sigma : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ be a permutation and $\{e_i\}$ be
the natural basis B of $\mathbb{R}^n$. Let $f_\sigma : \mathbb{R}^n \to \mathbb{R}^n$ defined by $f_\sigma(e_i) = e_{\sigma(i)}$ be extended
linearly over $\mathbb{R}^n$. The matrix $m(f_\sigma)_B$ is called the permutation matrix corresponding
to $\sigma$.


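Example 8.6.1 (i) If $I : \mathbb{R}^n \to \mathbb{R}^n$ is the identity transformation and $B = \{e_i\}$ is the
natural basis of $\mathbb{R}^n$, then $m(I)_B = I$ (identity matrix).

(ii) Let $V = V_3[x]$ be the vector space of polynomials in x over $\mathbb{R}$, of degree
less than 3 and $D : V \to V$, $f \mapsto f'$ be the differential operator. For the basis
$B = \{1, x, x^2\}$,

\[ m(D)_B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}. \]

(iii) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation defined by
$T(x, y) = (2x + y,\ x - y)$. If $B = \{(1, 0), (0, 1)\}$ and $C = \{(1, 1), (1, -1)\}$, then

\[ m(T)_B = \begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix} \quad\text{and}\quad m(T)_C = \begin{pmatrix} \tfrac{3}{2} & \tfrac{3}{2} \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{pmatrix}. \]

(iv) The permutation matrix corresponding to

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \quad\text{is}\quad \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}. \]

This is so because by Definition 8.6.2, $f_\sigma(e_1) = e_{\sigma(1)} = e_3 = 0 \cdot e_1 + 0 \cdot e_2 + 1 \cdot e_3$.
Similarly, $f_\sigma(e_2) = e_{\sigma(2)} = e_1 = 1 \cdot e_1 + 0 \cdot e_2 + 0 \cdot e_3$ and
$f_\sigma(e_3) = e_{\sigma(3)} = e_2 = 0 \cdot e_1 + 1 \cdot e_2 + 0 \cdot e_3$.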
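
The recipe of Definition 8.6.1 (expand each $T(v_i)$ in the basis, then transpose the matrix of coefficients) can be sketched in Python for the operator of Example 8.6.1(iii). This is only an illustrative sketch: the helper names `solve` and `matrix_of` are hypothetical, the domain and codomain use the same basis, and the input vectors are assumed to form a basis (otherwise the pivot search fails).

```python
from fractions import Fraction

def solve(cols, b):
    """Solve sum_j x_j * cols[j] = b by Gauss-Jordan elimination.

    cols is a list of vectors assumed to form a basis; b is a vector.
    """
    n = len(b)
    # Augmented matrix: column j is cols[j], last column is b.
    M = [[Fraction(cols[j][i]) for j in range(n)] + [Fraction(b[i])]
         for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot row
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

def matrix_of(T, basis):
    """Matrix representation m(T) of T relative to `basis`.

    Row i of `coords` holds the coefficients of T(basis[i]) (the matrix of
    coefficients A); the result is its transpose, as in Definition 8.6.1.
    """
    n = len(basis)
    coords = [solve(basis, T(v)) for v in basis]
    return [[coords[i][j] for i in range(n)] for j in range(n)]

# The operator of Example 8.6.1(iii).
T = lambda v: (2 * v[0] + v[1], v[0] - v[1])
m_B = matrix_of(T, [(1, 0), (0, 1)])
m_C = matrix_of(T, [(1, 1), (1, -1)])
```

Exact rational arithmetic (`Fraction`) is used so that the entries 3/2 and −1/2 of $m(T)_C$ are not affected by rounding.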

Given an m × n matrix M over a field F, we can find a linear transformation
$T : F^n \to F^m$ such that m(T) = M.

Example 8.6.2 Suppose

\[ M = \begin{pmatrix} 1 & 2 \\ -1 & 4 \\ 0 & 5 \end{pmatrix}. \]

Then the column vectors u = (1, −1, 0) and v = (2, 4, 5) define a linear
transformation $T : \mathbb{R}^2 \to \mathbb{R}^3$ by

\[ T(x, y) = xu + yv = x(1, -1, 0) + y(2, 4, 5) = (x + 2y,\ -x + 4y,\ 5y) \quad\text{for all } (x, y) \in \mathbb{R}^2. \]

If B is the natural basis of $\mathbb{R}^2$, then $T(e_1) = T((1, 0)) = (1, -1, 0)$ and
$T(e_2) = T((0, 1)) = (2, 4, 5)$. Hence $m(T)_B = M$.

The following proposition is a generalization of Example 8.6.2. 

Proposition 8.6.1 Given an m × n matrix M over $\mathbb{R}$, the map
$T_M : \mathbb{R}^n \to \mathbb{R}^m$, $X \mapsto MX$, is a linear transformation, where X is viewed as a
column vector of $\mathbb{R}^n$.

Conversely, if $B = \{E_1, E_2, \ldots, E_n\}$ is the set of unit column vectors of $\mathbb{R}^n$ and
$T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation such that $T(E_j) = M_j$, where $M_j$ is a column
vector in $\mathbb{R}^m$, then $T = T_M$, where M is the matrix whose column vectors are
$M_1, M_2, \ldots, M_n$.

Proof Clearly, $T_M$ is a linear transformation. Conversely, we suppose

\[ T(E_j) = M_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}. \]

Then we can write every X in $\mathbb{R}^n$ as

\[ X = x_1 E_1 + \cdots + x_n E_n = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]

Hence $T(X) = x_1 M_1 + x_2 M_2 + \cdots + x_n M_n = MX$, where M is the matrix whose
column vectors are $M_1, M_2, \ldots, M_n$. Consequently, $T(X) = MX = T_M(X)$ for all
$X \in \mathbb{R}^n$, so that $T = T_M$. □


Proposition 8.6.2 Let V and W be finite dimensional vector spaces over F.
Suppose dim V = n and dim W = m. If $B_1 = \{e_1, e_2, \ldots, e_n\}$ and $B_2 = \{f_1, f_2, \ldots, f_m\}$
are ordered bases of V and W respectively, then the mapping $\psi : L(V, W) \to M_{mn}(F)$,
$T \mapsto m(T)$ is a bijection and also an isomorphism of vector spaces.

Proof Let $M \in M_{mn}(F)$. Then there exists a matrix $A = (a_{ij})_{n \times m}$ such that
$M = A^t$. Suppose $y_i = \sum_{j=1}^{m} a_{ij} f_j$, $1 \le i \le n$. Then $y_i \in W$ and there exists
a unique linear transformation $T : V \to W$ such that $T(e_i) = y_i = \sum_{j=1}^{m} a_{ij} f_j$.
Hence $m(T) = A^t = M$. This shows $\psi$ is onto. Let T, S be in L(V, W) such
that $m(T) = m(S) = A^t$, where $A = (a_{ij})$. If $\sum_{j=1}^{m} a_{ij} f_j = y_i$, $1 \le i \le n$, then
$T(e_i) = \sum_{j=1}^{m} a_{ij} f_j = y_i = S(e_i)$ for all $i = 1, 2, \ldots, n$. Hence it follows that
S = T. This shows that $\psi$ is injective. Consequently, $\psi$ is a bijection which is clearly
an isomorphism of vector spaces. □

Corollary 1 Let U, V, W be finite dimensional vector spaces over F. Given
ordered bases $B_1$, $B_2$ and $B_3$ of U, V, and W respectively and linear transformations
$T_1 : U \to V$, $T_2 : V \to W$, we have $m(T_2 \circ T_1) = m(T_2) \cdot m(T_1)$.
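
Corollary 1 can be checked numerically for the natural bases. The sketch below is illustrative only: $T_1$ is the map of Example 8.6.2, while $T_2$ and the helper names are hypothetical choices made for this check.

```python
def std_matrix(T, n):
    """Matrix of a linear map T : R^n -> R^m relative to the natural bases.

    Column j of the result is T(e_j).
    """
    cols = [T(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def mat_mul(X, Y):
    """Product of an (a x b) matrix X with a (b x c) matrix Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# T1 : R^2 -> R^3 is the map of Example 8.6.2; T2 : R^3 -> R^2 is hypothetical.
T1 = lambda v: (v[0] + 2 * v[1], -v[0] + 4 * v[1], 5 * v[1])
T2 = lambda w: (w[0] - w[2], 2 * w[1] + w[2])
compose = lambda v: T2(T1(v))

lhs = std_matrix(compose, 2)                           # m(T2 o T1)
rhs = mat_mul(std_matrix(T2, 3), std_matrix(T1, 2))    # m(T2) . m(T1)
```

Both `lhs` and `rhs` come out as the same 2 × 2 matrix, in agreement with the corollary.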

Corollary 2 Let V be a finite dimensional vector space and B be a fixed ordered
basis of V. A non-zero linear operator $T : V \to V$ is non-singular (i.e., invertible)
iff its corresponding matrix m(T) is non-singular.

Given two bases $B_1$ and $B_2$ of a finite dimensional vector space V and a linear
operator $T : V \to V$, the corresponding matrices $m(T)_{B_1}$ and $m(T)_{B_2}$ are
closely related. This introduces the concept of similar matrices.

Definition 8.6.3 Let A, B be two matrices in $M_{nn}(F)$. Then A is said to be similar
to B, denoted by A ~ B, iff there exists a non-singular matrix P in $M_{nn}(F)$ such
that $B = PAP^{-1}$.


Remark Similarity relation on the set M nn (F) is an equivalence relation. We now 
characterize similar matrices with the help of linear operators. 

Theorem 8.6.1 Two n × n matrices over a field are similar iff they represent the
same linear operator (each relative to a single basis).

Proof Let V be an n-dimensional vector space over a field F and
$B_1 = \{v_1, v_2, \ldots, v_n\}$, $B_2 = \{u_1, u_2, \ldots, u_n\}$ be two bases of V. Suppose A and B are
two n × n matrices representing the same linear transformation $T : V \to V$ with
respect to the bases $B_1$ and $B_2$, respectively. We claim that A ~ B.
If $T(v_i) = \sum_{j=1}^{n} a_{ji} v_j$ and $T(u_i) = \sum_{j=1}^{n} b_{ji} u_j$, where $a_{ji}, b_{ji} \in F$
and $1 \le i, j \le n$, then $A = (a_{ij})$ and $B = (b_{ij})$. Again since $B_1$ and $B_2$ are both
bases of V, we can represent $u_i$ as $u_i = \sum_{j=1}^{n} c_{ji} v_j$ and $v_i$ as $v_i = \sum_{j=1}^{n} d_{ji} u_j$,
$1 \le i \le n$. If $C = (c_{ij})$ and $D = (d_{ij})$, then

\[ v_i = \sum_{j=1}^{n} d_{ji} u_j = \sum_{j=1}^{n} d_{ji}\Big(\sum_{k=1}^{n} c_{kj} v_k\Big) = \sum_{k=1}^{n}\Big(\sum_{j=1}^{n} c_{kj} d_{ji}\Big) v_k \;\Longrightarrow\; \sum_{j=1}^{n} c_{kj} d_{ji} = \begin{cases} 1, & \text{if } i = k \\ 0, & \text{if } i \ne k. \end{cases} \]

This shows that CD = I. Similarly, DC = I. Hence C is a non-singular matrix with
its inverse D, i.e., $C^{-1} = D$. Expanding $T(u_i)$ in two ways, we have

\[ T(u_i) = \sum_{j=1}^{n} c_{ji} T(v_j) = \sum_{k=1}^{n} (AC)_{ki}\, v_k \quad\text{and}\quad T(u_i) = \sum_{j=1}^{n} b_{ji} u_j = \sum_{k=1}^{n} (CB)_{ki}\, v_k, \]

so that AC = CB and hence $B = C^{-1}AC = DAD^{-1}$, which shows that A ~ B.

Conversely, let A and B be two n × n similar matrices over F and $T : V \to V$
be a linear operator of an n-dimensional vector space V such that $A = m(T)_{B_1}$
for some ordered basis $B_1 = \{v_1, v_2, \ldots, v_n\}$ of V. We now show that there exists
an ordered basis $B_2$ of V such that $B = m(T)_{B_2}$. As B is similar to A, there
exists a non-singular matrix P such that $B = PAP^{-1}$. If $A = (a_{ij})$, then
$T(v_i) = \sum_{j=1}^{n} a_{ji} v_j$, $1 \le i \le n$. Let $P^{-1} = (q_{ij})$. We define a linear
transformation $S : V \to V$, $v_i \mapsto \sum_{j=1}^{n} q_{ji} v_j$, $1 \le i \le n$. Then $P^{-1} = m(S)_{B_1}$. As $P^{-1}$ is
non-singular, S is injective. Hence for $d_i \in F$, the relation
$\sum_{i=1}^{n} d_i S(v_i) = 0 \Rightarrow S\big(\sum_{i=1}^{n} d_i v_i\big) = 0 \Rightarrow \sum_{i=1}^{n} d_i v_i = 0$,
which shows that each $d_i = 0$. Thus the set $B_2 = \{S(v_1), S(v_2), \ldots, S(v_n)\}$ is linearly
independent and hence $B_2$ also forms a basis of V.
Let $m(T)_{B_2} = M$. Then by the first part (with $C = P^{-1}$), $M = PAP^{-1} = B$. □

Remark The matrix P is not unique (see Example 8.6.3). 
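
The similarity relation and the non-uniqueness of P can be checked on the operator $T(x, y) = (x - y,\ y - x)$ and the bases $B_1 = \{(1, 0), (0, 1)\}$, $B_2 = \{(1, 2), (-1, 1)\}$ used in Example 8.6.3(a). The Python sketch below is illustrative only: the matrix N (whose columns are the vectors of $B_2$), the choice $P = N^{-1}$ and the second matrix $Q = 2P$ are assumptions made for this check, not necessarily the matrices printed in the example.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Product of two 2 x 2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(1), F(-1)], [F(-1), F(1)]]   # m(T) relative to B1
N = [[F(1), F(-1)], [F(2), F(1)]]    # columns: the vectors (1, 2), (-1, 1) of B2
P = inv2(N)

B = mat_mul(mat_mul(P, A), inv2(P))            # B = P A P^{-1}

Q = [[2 * x for x in row] for row in P]        # a different matrix, Q = 2P
B_from_Q = mat_mul(mat_mul(Q, A), inv2(Q))     # Q A Q^{-1}
```

Both products give the same matrix B, whose columns are the coordinates of T(1, 2) = (−1, 1) and T(−1, 1) = (−2, 2) relative to $B_2$.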

We now define another relation on M nn (F). 

Definition 8.6.4 Let A, B be in $M_{nn}(F)$. Then A is said to be equivalent to B,
denoted A ~ B, iff there exist non-singular matrices P, Q in $M_{nn}(F)$ such that
B = PAQ.

Remark Similar matrices are equivalent but its converse is not true (see Exam- 
ple 8.6.3). 

Example 8.6.3 (a) Let T : R 2 — >► R 2 be defined by T(x, y) = (x — y, y — x) and 
Bi = {(1,0), (0, 1)} and B 2 = {( 1 , 2 ), (- 1 , 1 )}. Then m(T) Bl = A = (_\ ~ l ) and 

m(T)B 2 = P = (1 2 ) i m Ply ^at A ~ B. If P = ( ^ j) and 2 = (2 _ 1 3 ) ? then B — 
PAP -1 = QAQ~ l implies that P is not unique. 

(b) The matrices A = ( * ) and B = ( 1 ) are equivalent matrices but not similar.

Remark An important application of equivalent matrices is in their Smith normal
form (see Ex. 21 of Exercises-II).

We now establish a natural correspondence between linear transformations and 
matrices which assigns to the sum and composite of two linear transformations the 
sum and product of two matrices, respectively. 

Theorem 8.6.2 Let V be an n-dimensional vector space over F. Then the two
algebras $L_F(V, V)$ of all linear operators on V over F and $M_{nn}(F)$ of all n × n
matrices over F are isomorphic.

Proof Let $B = \{v_1, v_2, \ldots, v_n\}$ be a basis of V over F and $T \in L_F(V, V)$. If m(T)
is the matrix representation of T with respect to the basis B, then the mapping
$\psi : L_F(V, V) \to M_{nn}(F)$, $T \mapsto m(T)$ is an isomorphism of algebras. □

Corollary 1 A non-zero element T in $L_F(V, V)$ is invertible iff m(T) has an inverse
in $M_{nn}(F)$, i.e., iff m(T) is non-singular.

Corollary 2 The group $L_n(F)$ of all non-singular linear operators (i.e., the group
of all invertible elements) in $L_F(V, V)$ is isomorphic to the group GL(n, F) of all
non-singular matrices in $M_{nn}(F)$.

Definition 8.6.5 The group $L_n(F)$ is called a full linear group and the group
GL(n, F) is called a general linear group.

Similar matrices are closely related to their ranks and determinants. 

Definition 8.6.6 Let A be an m × n matrix over F and V = R(A) be the subspace
of $F^n$ generated by the row vectors of A. The dimension of V is called the row rank
of A. If U = C(A) is the subspace of $F^m$ generated by the column vectors of A,
then dim U is called the column rank of A.

Proposition 8.6.3 Let A be an m × n matrix over F. Then

(i) row rank of A $\le \min\{m, n\}$;

(ii) column rank of A $\le \min\{m, n\}$;

(iii) row rank of A = column rank of A.

Proof (i) Let $B = \{R_1, R_2, \ldots, R_m\}$ be the set of all m rows of A. Then B generates
the row space R(A) of A. Hence $\dim(R(A)) \le m$. Again R(A) is a subspace of $F^n$.
Hence $\dim(R(A)) \le n$. This proves (i).

(ii) Similar to (i).

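(iii) Let row rank of A = r and the following r row vectors form a basis
of R(A):

\[ \begin{aligned} B_1 &= (b_{11}, b_{12}, \ldots, b_{1n}), \\ B_2 &= (b_{21}, b_{22}, \ldots, b_{2n}), \\ &\ \ \vdots \\ B_r &= (b_{r1}, b_{r2}, \ldots, b_{rn}). \end{aligned} \]

Then each row vector of $A = (a_{ij})$ can be expressed uniquely as

\[ \begin{aligned} R_1 &= c_{11}B_1 + c_{12}B_2 + \cdots + c_{1r}B_r, \\ R_2 &= c_{21}B_1 + c_{22}B_2 + \cdots + c_{2r}B_r, \\ &\ \ \vdots \\ R_m &= c_{m1}B_1 + c_{m2}B_2 + \cdots + c_{mr}B_r, \quad\text{where } c_{ij} \in F. \end{aligned} \]

Equating the jth components from the above relations, we get

\[ \begin{aligned} a_{1j} &= c_{11}b_{1j} + c_{12}b_{2j} + \cdots + c_{1r}b_{rj}, \\ a_{2j} &= c_{21}b_{1j} + c_{22}b_{2j} + \cdots + c_{2r}b_{rj}, \\ &\ \ \vdots \\ a_{mj} &= c_{m1}b_{1j} + c_{m2}b_{2j} + \cdots + c_{mr}b_{rj}. \end{aligned} \]

The above relations show that

\[ \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix} = b_{1j}\begin{pmatrix} c_{11} \\ c_{21} \\ \vdots \\ c_{m1} \end{pmatrix} + b_{2j}\begin{pmatrix} c_{12} \\ c_{22} \\ \vdots \\ c_{m2} \end{pmatrix} + \cdots + b_{rj}\begin{pmatrix} c_{1r} \\ c_{2r} \\ \vdots \\ c_{mr} \end{pmatrix}. \]

Thus the column space of the matrix A has dimension at most r, i.e., column rank
of A $\le$ r = row rank of A. Similarly, row rank of A $\le$ column rank of A. This
proves (iii). □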
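
Part (iii) of Proposition 8.6.3 can be observed numerically by computing the rank of a matrix and of its transpose. The sketch below is a minimal illustration: the function `rank` (row reduction over the rationals) and the sample matrix are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(M[0])):
        # Look for a non-zero entry in column c at or below row r.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def transpose(rows):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*rows)]

A = [[1, 2, 3, 4],
     [2, 4, 6, 8],   # twice the first row
     [1, 0, 1, 0]]

row_rank = rank(A)                 # dimension of the row space
column_rank = rank(transpose(A))   # dimension of the column space
```

The rows of the transpose are the columns of A, so `rank(transpose(A))` is the column rank; for this matrix both values are 2.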

Definition 8.6.7 Let A be an m × n matrix over F. Then row rank of A = column
rank of A = r. This common value is called the rank of A, denoted by rank A.

Proposition 8.6.4 Let A and B be two similar matrices of order n over F. Then

(i) rank A = rank B;

(ii) det A = det B.

Proof A ~ B shows that there exists a non-singular matrix P in $M_{nn}(F)$ such that
$B = PAP^{-1}$. Then

(i) rank B = rank $(PAP^{-1})$ = rank A and

(ii) det B = det$(PAP^{-1})$ = det A. □


8.7 Eigenvalues and Eigenvectors

The algebra of matrices can be applied smoothly to diagonal matrices. The reason 
is that for the addition and multiplication of any two diagonal matrices, one simply 
adds (or multiplies) the corresponding diagonal entries. So it has become important 
to know which matrices are similar to diagonal matrices and which pairs of diago- 
nal matrices are similar to each other. To obtain the answer to these questions, the 
concepts of eigenvalue and eigenvector are introduced. 

Let V be an n-dimensional vector space over F and $T : V \to V$ a linear operator.
If $B_1$ is a basis of V and $m(T)_{B_1} = A$, then by Theorem 8.6.1, A is similar to a
diagonal matrix iff there exists a basis $B_2 = \{v_1, v_2, \ldots, v_n\}$ of V such that $m(T)_{B_2}$
is a diagonal matrix, i.e., iff $T(v_i) = \lambda_i v_i$ for some $\lambda_i \in F$, $i = 1, 2, \ldots, n$. Hence
any non-zero vector $v \in V$ for which $T(v) = \lambda v$ for some $\lambda \in F$ plays an important
role in our study and leads to the following concepts.

Definition 8.7.1 Let V be an n-dimensional vector space over F and $T : V \to V$
a linear operator. An element $\lambda$ in F is called an eigenvalue of T (or characteristic
root of T) iff there exists a non-zero vector $v \in V$ such that $T(v) = \lambda v$. This vector
v (if it exists) is called an eigenvector of T corresponding to the eigenvalue $\lambda$. For a
square matrix we have an analogue of this definition.

Definition 8.7.2 Let A be a square matrix of order n over F. An element $\lambda$ in F
is called an eigenvalue of A iff there exists a non-zero vector $X = (x_1, x_2, \ldots, x_n)$
in $F^n$ such that $AX = \lambda X$ (or $XA = \lambda X$). The vector X (if it exists) is called an
eigenvector of A corresponding to the eigenvalue $\lambda$.

Remark 1 If A is an n × n matrix over $\mathbb{R}$, then $AX = \lambda X$ shows that A dilates
(lengthens) X if $|\lambda| > 1$ and contracts (shortens) X if $|\lambda| < 1$, in $\mathbb{R}^n$.

Remark 2 Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear operator and $m(T)_B = A$, where B is the
natural basis of $\mathbb{R}^n$. Then for $X = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, $T(X) = XA$. This shows
that $\lambda$ is an eigenvalue of A iff $\lambda$ is an eigenvalue of T and X is the corresponding
eigenvector of both T and A.

Remark 3 If v is an eigenvector of T corresponding to an eigenvalue $\lambda \in F$, then
for any non-zero scalar $\alpha$ in F, $\alpha v$ is also an eigenvector of T corresponding to the
same eigenvalue $\lambda$. Hence there exist many eigenvectors of T (or A) corresponding
to an eigenvalue.

The eigenvectors have many interesting properties. 

Proposition 8.7.1 Let A be an n × n matrix over F and $E(\lambda)$ be the set of all
eigenvectors of A corresponding to an eigenvalue $\lambda$, together with the zero vector.
Then $E(\lambda)$ forms a subspace of the vector space $F^n$ over F.

Proof Clearly, $E(\lambda) = \{X \in F^n : AX = \lambda X\} \ne \emptyset$. Then for $X, Y \in E(\lambda)$ and
$\alpha, \beta \in F$, $A(\alpha X + \beta Y) = \lambda(\alpha X + \beta Y) \Rightarrow \alpha X + \beta Y \in E(\lambda) \Rightarrow E(\lambda)$ is a subspace
of $F^n$ over F. □

Definition 8.7.3 The vector space $E(\lambda)$ is called the eigenspace of A corresponding
to the eigenvalue $\lambda$. For a linear operator $T : V \to V$ and an eigenvalue $\lambda$,
$E(\lambda) = \{v \in V : T(v) = \lambda v\}$ is a subspace of V, called the null space of $T - \lambda I$, and
the maximum number of linearly independent eigenvectors in $E(\lambda)$ is called the
dimension of $E(\lambda)$.

Remark If $\lambda_1$ and $\lambda_2$ are two distinct eigenvalues of T, then $E(\lambda_1) \cap E(\lambda_2) = \{0\}$,
i.e., the zero vector is the only vector common to the corresponding eigenspaces.

Theorem 8.7.1 Let A be an n × n matrix over F and $\lambda \in F$. Then $\lambda$ is an eigenvalue
of A iff the matrix $(A - \lambda I)$ is singular.

Proof Let $B = A - \lambda I$. Then $\lambda$ is an eigenvalue of A $\Leftrightarrow$ $AX = \lambda X$ for some
non-zero $X \in F^n$ $\Leftrightarrow$ $(A - \lambda I)X = 0$ $\Leftrightarrow$ the homogeneous system of equations BX = 0
has a non-trivial solution $\Leftrightarrow$ the solution space of the system BX = 0 is non-zero
and hence has dimension > 0. If rank B = r, then $n - r > 0 \Rightarrow n > r$. This situation
occurs iff B is singular. □

Corollary $\lambda$ is an eigenvalue of A iff $\det(A - \lambda I) = 0$.

Proof $\lambda$ is an eigenvalue of A $\Leftrightarrow$ the matrix $(A - \lambda I)$ is singular $\Leftrightarrow$
$\det(A - \lambda I) = 0$. □

We now consider the polynomial $\det(A - xI)$ in x of degree n over F.

Definition 8.7.4 Let A be an n × n matrix over F. Then the matrix $(A - xI)$ is
called the characteristic matrix of A and $\det(A - xI)$ is called the characteristic
polynomial of A, denoted by $\kappa_A(x)$. The equation $\kappa_A(x) = 0$ is called the
characteristic equation of A.

Remark The eigenvalues of A are the roots of its characteristic polynomial.
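
The criterion $\det(A - \lambda I) = 0$ can be tried directly on a small matrix. The following Python sketch is illustrative only: the matrix A is a hypothetical example, `det` uses cofactor expansion along the first row (adequate only for small n), and the search is restricted to integer candidates, so it would miss non-integer eigenvalues.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def char_poly_at(A, x):
    """Value of the characteristic polynomial det(A - xI) at the scalar x."""
    n = len(A)
    return det([[A[i][j] - (x if i == j else 0) for j in range(n)]
                for i in range(n)])

A = [[2, 1],
     [1, 2]]   # hypothetical example; det(A - xI) = x^2 - 4x + 3

# Integer roots of the characteristic polynomial in a small search window.
eigenvalues = [x for x in range(-10, 11) if char_poly_at(A, x) == 0]
```

For this A the roots found are 1 and 3, with eigenvectors (1, −1) and (1, 1), respectively.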

Example 8. 7 . 1 For the matrix A = ( ^ 4 ) 

(i) the characteristic matrix is A — xl = ( 1 “ x 4 ^ x ) ; 

(ii) the characteristic polynomial is $\kappa_A(x) = x^2 - 5x + 2$;

(iii) the characteristic equation is $x^2 - 5x + 2 = 0$.


Proposition 8.7.2 Similar matrices have the same eigenvalues.

Proof Suppose A ~ B. Then there exists a non-singular matrix C such that
$B = CAC^{-1} \Rightarrow B - xI = CAC^{-1} - xCIC^{-1} = C(A - xI)C^{-1} \Rightarrow A - xI \sim B - xI
\Rightarrow \det(A - xI) = \det(B - xI) \Rightarrow$ A and B have the same eigenvalues. □

Remark A matrix over $\mathbb{R}$ may not have an eigenvalue but a matrix over $\mathbb{C}$ always
has an eigenvalue. This is so because a polynomial over $\mathbb{R}$ may not have a root in $\mathbb{R}$
but a non-constant polynomial over $\mathbb{C}$ always has a root in $\mathbb{C}$ [see the Fundamental
Theorem of Algebra, Chap. 11 of the book or Theorem 11, Herstein (1964, p. 337)].

Proposition 8.7.3 Let V be an n-dimensional vector space over a field F and
$T : V \to V$ be a linear operator. Then the eigenvalues of T are the eigenvalues of
any one of its matrix representations (relative to a single basis B).

Proof Let $m(T)_B = A$, where B is the natural basis of $F^n$. Then for
$X = (x_1, x_2, \ldots, x_n) \in F^n$, $T(X) = XA$. Hence it follows that $\lambda$ is an eigenvalue of
A $\Leftrightarrow$ $\lambda$ is an eigenvalue of T. The proof is completed by using Proposition 8.7.2. □

Example 8.7.2 (i) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (x, -y)$. Then T is a reflection about
the x-axis and every multiple of $e_1 = (1, 0)$ is mapped onto itself by T. Hence
$T(X) = 1 \cdot X$ for every $X = \alpha e_1$, $\alpha \in \mathbb{R}$ implies that $\lambda = 1$ is an eigenvalue of T
and the corresponding eigenspace E(1) is the x-axis. Similarly, for every multiple
$Y = \beta e_2$ of $e_2 = (0, 1)$, $\beta \in \mathbb{R}$, $T(Y) = -1 \cdot Y$ implies that −1 is another eigenvalue
of T and the corresponding eigenspace E(−1) is the y-axis. Thus geometrically it
follows that 1 and −1 are the only eigenvalues of T.

(ii) Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation of the plane through an angle $\theta$, $0 < \theta < \pi$.
Then T has no eigenvalue. This is so because $\lambda X$ ($\lambda \in \mathbb{R}$) is a scalar multiple of
X and no vector X except (0, 0) is mapped into a scalar multiple of itself. Hence
$T(X) \ne \lambda X$ unless X = (0, 0) shows that there is no eigenvalue of T and thus there
is no eigenvector of T.

(iii) Let V = D[x] be the vector space of real valued functions on $\mathbb{R}$ with
derivatives of all orders. If $\lambda\ (\ne 0) \in \mathbb{R}$, $f(x) = e^{\lambda x}$ and $D : V \to V$ is the differential
operator, then $D(f(x)) = \lambda e^{\lambda x} \Rightarrow D(f) = \lambda f \Rightarrow \lambda$ is an eigenvalue of D. Since
the solutions of the differential equation $y' = \lambda y$ are of the form $y = \alpha e^{\lambda x}$, $\alpha \in \mathbb{R}$,
the eigenspace $E(\lambda)$ is 1-dimensional with basis {f}.

(iv) Let

\[ A = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 7 & 1 \end{pmatrix} \quad\text{and}\quad X = (1\ \ 0\ \ 0). \]

Then $XA = 5X \Rightarrow 5$ is an eigenvalue of A and X is the corresponding eigenvector.

We now prove some other interesting properties of eigenvalues and eigenvectors. 

Proposition 8.7.4 If A is a non-singular matrix of order n over F and $\lambda \ne 0$ is an
eigenvalue of A, then $\lambda^{-1}$ is an eigenvalue of $A^{-1}$.

Proof A is non-singular (by hypothesis) $\Rightarrow \det A \ne 0$. Then

\[ \det(A^{-1} - \lambda^{-1}I) = \lambda^{-n}\det(\lambda A^{-1} - I) = \lambda^{-n}\cdot\frac{\det(A(\lambda A^{-1} - I))}{\det A} = \lambda^{-n}\cdot\frac{\det(\lambda I - A)}{\det A} = 0, \]

since $\lambda$ is an eigenvalue of A. This implies that $\lambda^{-1}$ is an eigenvalue of $A^{-1}$. □

Proposition 8.7.5 (Uniqueness of eigenvalue) Given an eigenvector of a square
matrix A, the corresponding eigenvalue of A is unique.

Proof Let X be an eigenvector of A and $\lambda_1, \lambda_2$ be its corresponding eigenvalues.
Then $AX = \lambda_1 X = \lambda_2 X \Rightarrow (\lambda_1 - \lambda_2)X = 0 \Rightarrow \lambda_1 = \lambda_2$, since $X \ne 0$. □

Theorem 8.7.2 Let A be an n × n matrix over F and $\lambda_1, \lambda_2, \ldots, \lambda_r$ be distinct
eigenvalues of A in F, with corresponding eigenvectors $X_1, X_2, \ldots, X_r$, respectively.
Then the set $S = \{X_1, X_2, \ldots, X_r\}$ is linearly independent.

Proof Let $V = F^n$. Then $X_i \in F^n$ and $AX_i = \lambda_i X_i$ for $i = 1, 2, \ldots, r$. We prove
the theorem by induction on r. If r = 1, then $X_1 \ne 0$ and hence $\{X_1\}$ is linearly
independent. We assume that the statement is true for m < r.

Then

\[ \begin{aligned} &\alpha_1 X_1 + \alpha_2 X_2 + \cdots + \alpha_m X_m + \alpha_{m+1}X_{m+1} = 0 \\ \Rightarrow\ & A(\alpha_1 X_1 + \alpha_2 X_2 + \cdots + \alpha_m X_m + \alpha_{m+1}X_{m+1}) = 0 \\ \Rightarrow\ & \alpha_1\lambda_1 X_1 + \alpha_2\lambda_2 X_2 + \cdots + \alpha_m\lambda_m X_m + \alpha_{m+1}\lambda_{m+1}X_{m+1} = 0 \\ \Rightarrow\ & \alpha_1(\lambda_1 - \lambda_{m+1})X_1 + \alpha_2(\lambda_2 - \lambda_{m+1})X_2 + \cdots + \alpha_m(\lambda_m - \lambda_{m+1})X_m = 0 \\ \Rightarrow\ & \alpha_1 = \alpha_2 = \cdots = \alpha_m = 0 \text{ by the induction hypothesis, since the } \lambda_i \text{ are distinct} \\ \Rightarrow\ & \alpha_{m+1} = 0, \text{ since } X_{m+1} \ne 0. \end{aligned} \]

Here the fourth line is obtained by subtracting $\lambda_{m+1}$ times the first relation from
the third. Consequently, the set S is linearly independent. □

Corollary An n × n matrix over F has at most n distinct eigenvalues.

Proof Since dim $F^n$ over F is n, the corollary follows from Theorem 8.7.2. □

Theorem 8.7.3 Let V be an n-dimensional vector space over F and $T : V \to V$
be a linear operator with r distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_r$ and $X_1, X_2, \ldots, X_r$
the corresponding eigenvectors. Then the set $S = \{X_1, X_2, \ldots, X_r\}$ is linearly
independent.

Proof Similar to Theorem 8.7.2. □


Proposition 8.7.6 Let A be an n × n matrix over F. If A has n distinct eigenvalues,
then A is similar to an n × n diagonal matrix.

Proof Let $T : F^n \to F^n$ be the linear operator such that $m(T)_B = A$, where B is
the natural basis of $F^n$ and $\lambda_1, \lambda_2, \ldots, \lambda_n$ be n distinct eigenvalues of A with the
corresponding eigenvectors $X_1, X_2, \ldots, X_n$. Then the set $S = \{X_1, X_2, \ldots, X_n\}$ is
linearly independent and hence S forms a basis of $F^n$.

Consequently,

\[ \begin{aligned} T(X_1) &= \lambda_1 X_1 = \lambda_1 X_1 + 0 \cdot X_2 + \cdots + 0 \cdot X_n \\ T(X_2) &= \lambda_2 X_2 = 0 \cdot X_1 + \lambda_2 X_2 + \cdots + 0 \cdot X_n \\ &\ \ \vdots \\ T(X_n) &= \lambda_n X_n = 0 \cdot X_1 + 0 \cdot X_2 + \cdots + \lambda_n X_n. \end{aligned} \]

Hence

\[ m(T)_S = D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix} \;\Rightarrow\; A \sim D \]

by Theorem 8.6.1. □
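
The proof of Proposition 8.7.6 can be followed on a small case. In the Python sketch below the matrix A is a hypothetical example with the two distinct eigenvalues 1 and 3; the matrix E has the eigenvectors (1, −1) and (1, 1) as columns, and the product $E^{-1}AE$ is computed exactly. In the notation of Definition 8.6.3 this is $D = PAP^{-1}$ with $P = E^{-1}$. All names are illustrative.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Product of two 2 x 2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a non-singular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(2), F(1)], [F(1), F(2)]]     # hypothetical example, eigenvalues 1 and 3
E = [[F(1), F(1)], [F(-1), F(1)]]    # columns: eigenvectors (1, -1) and (1, 1)

P = inv2(E)
D = mat_mul(mat_mul(P, A), E)        # D = P A P^{-1}, since P^{-1} = E
```

The result D is the diagonal matrix with diagonal entries 1 and 3, listed in the same order as the eigenvector columns of E.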

Remark The converse of Proposition 8.7.6 is not true. The matrix

\[ A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \]

has only two distinct eigenvalues 2 and 3 but A is similar to itself, which is a
diagonal matrix.

We recall that an n × n matrix A over $\mathbb{C}$ can be expressed as A = X + iY, where
X and Y are matrices over $\mathbb{R}$. Then the matrix $\bar{A} = X - iY$ is called the conjugate
matrix of A and the elements of $\bar{A}$ are the conjugates of the corresponding elements
of A. We use the symbol $A^*$ to denote the conjugate transpose $\bar{A}^t$ of A.

An n × n matrix A over $\mathbb{C}$ is said to be Hermitian, skew-Hermitian or unitary
according as $A^* = A$, $A^* = -A$ or $AA^* = A^*A = I$, respectively.

Proposition 8.7.7 The eigenvalues of a Hermitian matrix are all real. 

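Proof Let A be an n × n Hermitian matrix and X be an eigenvector corresponding
to an eigenvalue $\lambda$ of A.

Then $AX = \lambda X \Rightarrow \bar{A}\bar{X} = \bar{\lambda}\bar{X} \Rightarrow (\bar{A}\bar{X})^t = (\bar{\lambda}\bar{X})^t \Rightarrow \bar{X}^t\bar{A}^t = \bar{\lambda}\bar{X}^t \Rightarrow \bar{X}^t A = \bar{\lambda}\bar{X}^t$
(since $A^* = A$) $\Rightarrow \bar{X}^t A X = \bar{\lambda}\bar{X}^t X \Rightarrow \lambda\bar{X}^t X = \bar{\lambda}\bar{X}^t X \Rightarrow (\lambda - \bar{\lambda})\bar{X}^t X = 0 \Rightarrow \lambda = \bar{\lambda}$,
since $\bar{X}^t X \ne 0$ for $X \ne 0$ $\Rightarrow \lambda$ is real. □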
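
Proposition 8.7.7 can be observed on a 2 × 2 example. The Python sketch below is only an illustration: the Hermitian matrix A is a hypothetical example, and the eigenvalues are obtained from the quadratic formula applied to the characteristic polynomial $x^2 - (\operatorname{tr} A)x + \det A$, which is valid for 2 × 2 matrices only.

```python
import cmath

def is_hermitian(A):
    """True iff A equals its conjugate transpose (2 x 2 case)."""
    return all(A[i][j] == A[j][i].conjugate()
               for i in range(2) for j in range(2))

def eigenvalues_2x2(A):
    """Roots of x^2 - tr(A) x + det(A) via the quadratic formula."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

A = [[2 + 0j, 1 + 1j],
     [1 - 1j, 3 + 0j]]   # hypothetical Hermitian matrix

lams = eigenvalues_2x2(A)
```

For a 2 × 2 Hermitian matrix with diagonal entries a, d and off-diagonal entry b, the discriminant equals $(a - d)^2 + 4|b|^2 \ge 0$, which is why both roots are real; for this A they are 4 and 1.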

Corollary The eigenvalues of a real symmetric matrix are all real.

Proposition 8.7.8 The eigenvalues of a skew-Hermitian matrix are 0 or purely
imaginary.

Proof Proceeding as in Proposition 8.7.7, we have $\lambda + \bar{\lambda} = 0$. This shows that $\lambda$ is
0 or purely imaginary. □

Corollary The eigenvalues of a real skew-symmetric matrix A are 0 or purely
imaginary (0 is the only possible real eigenvalue of A).

Proposition 8.7.9 Each eigenvalue of a unitary matrix has unit modulus.

Proof Proceeding as in Proposition 8.7.7 we have $\bar{X}^t X = \lambda\bar{\lambda}(\bar{X}^t X)$ and hence
$\lambda\bar{\lambda} = 1$, since $\bar{X}^t X \ne 0$. This implies that $|\lambda| = 1$. □

Corollary Each eigenvalue of a real orthogonal matrix A has unit modulus (1 and
−1 are the only possible real eigenvalues of A).

We now introduce the concepts of algebraic multiplicity and geometric multi- 
plicity of an eigenvalue. These concepts are important for our further study. 

Definition 8.7.5 Let A be an n × n matrix over F. The algebraic multiplicity of
an eigenvalue $\lambda$ of A is the multiplicity of $\lambda$ as a root of the characteristic equation
$\kappa_A(x) = 0$. The geometric multiplicity of $\lambda$ is the dimension of the eigenspace $E(\lambda)$.

Definition 8.7.6 An n × n matrix A over F is said to be diagonalizable over F iff
A is similar to a diagonal matrix D over F.

Definition 8.7.7 A linear operator $T : V \to V$ on a finite-dimensional vector space
over F is said to be diagonalizable over F iff there exists a basis B of V such that
the matrix $m(T)_B$ is diagonalizable.

Remark 1 An n × n matrix A over $\mathbb{C}$ is diagonalizable over $\mathbb{C}$ iff for each
eigenvalue $\lambda$ of A, the geometric and algebraic multiplicities of $\lambda$ are the same.

Remark 2 An n × n matrix A over $\mathbb{R}$ is diagonalizable over $\mathbb{R}$ iff all the eigenvalues
of A are real and for each eigenvalue of A, the geometric and algebraic multiplicities
are the same.

Remark 3 An n × n matrix A over F is diagonalizable iff there exists an n × n
non-singular matrix P over F such that $D = PAP^{-1}$ is a diagonal matrix.

Example 8.7.3 (i) The matrix

is diagonalizable over $\mathbb{C}$ but not over $\mathbb{R}$. This is so because A has three distinct
eigenvalues in $\mathbb{C}$ but only its eigenvalue 2 lies in $\mathbb{R}$.

(ii) The linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$, $(a, b) \mapsto (b, a)$ has its matrix
representation $m(T)_B$ with respect to the natural basis B = {(1, 0), (0, 1)} of $\mathbb{R}^2$ given by
$A = m(T)_B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. As 1 and −1 are two distinct eigenvalues of A, A is
diagonalizable over $\mathbb{R}$ and hence T is diagonalizable.

(iii) Let V be a vector space over F and $T : V \to V$ a linear operator with
an eigenvalue $\lambda$. Then for any polynomial f(x) in F[x], $f(\lambda)$ is an eigenvalue
of f(T).

To show this, let $v\ (\ne 0)$ be an eigenvector of T corresponding to the
eigenvalue $\lambda$. Then $T(v) = \lambda v \Rightarrow T(T(v)) = T(\lambda v) \Rightarrow T^2(v) = \lambda^2 v \Rightarrow \lambda^2$ is an
eigenvalue of $T^2$. In general, $T^n(v) = \lambda^n v$ for every positive integer n.

Let

\[ f(x) = a_0 + a_1 x + \cdots + a_n x^n \in F[x]. \]

Then

\[ f(T) = a_0 I + a_1 T + \cdots + a_n T^n \;\Rightarrow\; f(T)(v) = (a_0 + a_1\lambda + \cdots + a_n\lambda^n)v = f(\lambda)v. \]

Hence

\[ f(T)(v) = f(\lambda)v \;\Rightarrow\; f(\lambda) \]

is an eigenvalue of f(T). For simplicity, sometimes we write f(T)v in place of
f(T)(v), unless there is any confusion.
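
The identity $f(T)(v) = f(\lambda)v$ can be checked numerically. In the Python sketch below the matrix A, its eigenvector v = (1, 1) with eigenvalue 3, and the polynomial $f(x) = 1 + 2x + x^3$ are all hypothetical choices; both the matrix polynomial and the scalar polynomial are evaluated by Horner's rule.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate a0*I + a1*A + ... + an*A^n, where coeffs = [a0, ..., an]."""
    n = len(A)
    R = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):       # Horner: R <- R*A + c*I
        R = mat_mul(R, A)
        for i in range(n):
            R[i][i] += c
    return R

def poly_at_scalar(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n by Horner's rule."""
    r = 0
    for c in reversed(coeffs):
        r = r * x + c
    return r

A = [[2, 1], [1, 2]]     # hypothetical matrix with A v = 3 v for v = (1, 1)
v, lam = [1, 1], 3
coeffs = [1, 2, 0, 1]    # f(x) = 1 + 2x + x^3

fA = poly_at_matrix(coeffs, A)
fA_v = [sum(fA[i][j] * v[j] for j in range(2)) for i in range(2)]   # f(A) v
f_lam = poly_at_scalar(coeffs, lam)                                 # f(3)
```

Here f(3) = 34 and f(A)v = (34, 34) = 34v, so v is an eigenvector of f(A) for the eigenvalue f(3).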

We now apply the concepts of eigenvalues and eigenvectors to find conditions 
for diagonalization of a given square matrix. 

Theorem 8.7.4 Let V be an n-dimensional vector space over F and T : V — > V be 
a linear operator. Then T is represented by a diagonal matrix with respect to some 
basis iff V has a basis consisting of eigenvectors of T. 


Proof Let T be represented by a diagonal matrix

\[ \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix} \]

with respect to a basis $B = \{v_1, v_2, \ldots, v_n\}$ of V.

Then $T(v_1) = d_1 v_1$, $T(v_2) = d_2 v_2$, ..., $T(v_n) = d_n v_n$. Consequently, $v_i$ is an
eigenvector of T corresponding to the eigenvalue $d_i$, $i = 1, 2, \ldots, n$. Thus the basis
B consists of eigenvectors of T.

Conversely, if the eigenvectors $v_1, v_2, \ldots, v_n$ of T form a basis
$B = \{v_1, v_2, \ldots, v_n\}$ of V, then $m(T)_B$ is a diagonal matrix D of the above form, because
$T(v_i) = d_i v_i$, $d_i \in F$, $i = 1, 2, \ldots, n$. □

We now establish the following equivalent conditions for diagonalization of a 
square matrix over F. 

Theorem 8.7.5 Let V be an n-dimensional vector space over F and $T : V \to V$
be a linear operator with $m(T)_B = A$ with respect to some basis B of V. Then the
following statements are equivalent:

(i) A can be diagonalized;

(ii) A is similar to a diagonal matrix;

(iii) $F^n$ has a basis consisting of eigenvectors of A;

(iv) V has a basis consisting of eigenvectors of T.

Proof (i) $\Rightarrow$ (ii) It follows from Definition 8.7.6.

(ii) $\Rightarrow$ (iii) If A is similar to a diagonal matrix, then there exists a non-singular
matrix P such that $D = PAP^{-1}$ is a diagonal matrix. If $V_j$ is the jth column of
$P^{-1}$, then the jth column of $AP^{-1}$ is $AV_j$. Looking at the jth column of each
side of $P^{-1}D = AP^{-1}$, we have $AV_j = d_j V_j$, for some $d_j \in F$. Since $P^{-1}$ is
non-singular, each $V_j$ is non-zero and hence each $V_j$ is an eigenvector of A. As $\{V_j\}$ is
linearly independent, $\{V_j\}$ constitutes a basis of $F^n$.

(iii) $\Rightarrow$ (iv) Let $m(T)_B = A$ for some basis B of V. We assume that $F^n$ has a
basis consisting of eigenvectors of A and $v_1, v_2, \ldots, v_n$ are the elements of V
whose coordinate vectors relative to B are the above eigenvectors of A. Then
$B_1 = \{v_1, v_2, \ldots, v_n\}$ is a basis of V and each $v_j$ is an eigenvector of T.

(iv) $\Rightarrow$ (i) Let $B = \{v_1, v_2, \ldots, v_n\}$ be a basis of V consisting of eigenvectors
of T. Then $T(v_j) = d_j v_j$, $d_j \in F$, $j = 1, 2, \ldots, n$. Consequently, T has a diagonal
representation and hence A is diagonalizable. □


8.8 The Cayley-Hamilton Theorem 

Cayley-Hamilton Theorem is a famous theorem in linear algebra and gives a re- 
lation between a square matrix and its characteristic equation. More precisely, this 
theorem proves that every square matrix A satisfies its characteristic equation and 
the minimal polynomial of A divides its characteristic polynomial. This theorem is 
used to evaluate large powers of the matrix A. 

Let A be a square matrix over F and $f(x) = a_0 + a_1 x + \cdots + a_n x^n$ be a
polynomial in F[x]. We use the symbol $a_0 I + a_1 A + \cdots + a_n A^n$ to represent the
matrix f(A). The following theorem is important in the study of matrices.

Theorem 8.8.1 (Cayley-Hamilton Theorem) Every square matrix satisfies its own
characteristic equation.

Proof Let A be an n × n matrix over F and $\kappa_A(x)$ be its characteristic polynomial
in F[x]. We claim that $\kappa_A(A) = 0$. Let
$\kappa_A(x) = \det(A - xI) = c_0 x^n + c_1 x^{n-1} + \cdots + c_n$ and
$B(x) = B_0 x^{n-1} + B_1 x^{n-2} + \cdots + B_{n-1}$ be the adjoint of the matrix
$(A - xI)$, where each $B_i$ is a matrix over F. Then

\[ (A - xI)B(x) = \det(A - xI)\, I, \]

that is,

\[ (A - xI)(B_0 x^{n-1} + B_1 x^{n-2} + \cdots + B_{n-1}) = (c_0 x^n + c_1 x^{n-1} + \cdots + c_n) I. \]

Hence

\[ A(B_0 x^{n-1} + B_1 x^{n-2} + \cdots + B_{n-1}) - (B_0 x^n + B_1 x^{n-1} + \cdots + B_{n-1} x)
   = c_0 I x^n + c_1 I x^{n-1} + \cdots + c_n I. \]

Equating the coefficients of like powers of x, we have

\[ \begin{aligned}
-B_0 &= c_0 I \\
AB_0 - B_1 &= c_1 I \\
AB_1 - B_2 &= c_2 I \\
&\ \ \vdots \\
AB_{n-2} - B_{n-1} &= c_{n-1} I \\
AB_{n-1} &= c_n I.
\end{aligned} \]

Multiplying the above equations on the left by $A^n, A^{n-1}, \ldots, A, I$, respectively,
and then adding, the left-hand sides cancel in pairs and we have

\[ c_0 A^n + c_1 A^{n-1} + \cdots + c_{n-1} A + c_n I = 0. \]

This shows that $\kappa_A(A) = 0$. □
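
The theorem can be checked numerically on a concrete matrix. The sketch below is an illustration under stated assumptions: it computes the coefficients of $\det(xI - A)$ with the Faddeev-LeVerrier recurrence (a technique not used in this text, chosen here only because it is short), and the function names and the sample matrix are hypothetical. Since $\det(A - xI) = (-1)^n \det(xI - A)$, both polynomials vanish at A together.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(xI - A), by the Faddeev-LeVerrier recurrence."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        # M_k = A M_{k-1} + c_{n-k+1} I
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = mat_mul(A, M)
        # c_{n-k} = -trace(A M_k) / k
        coeffs[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return coeffs

def poly_at_matrix(coeffs, A):
    """Evaluate c_0 I + c_1 A + ... + c_n A^n by Horner's rule."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

A = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]   # sample matrix (an assumption)
p = char_poly(A)
print(p)                      # coefficients of (x - 2)^2 (x - 3)
print(poly_at_matrix(p, A))   # the zero matrix, as the theorem predicts
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the result is exactly the zero matrix rather than zero up to rounding.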

Remark The definition of a minimal polynomial of a matrix is similar to that of a 
linear transformation (see Definition 8.5.12). 

Definition 8.8.1 Let A be an n × n non-zero matrix over F. A polynomial m(x) in
F[x] with leading coefficient 1 and of least degree such that m(A) = 0 is said to
be a minimal polynomial of A.

Proposition 8.8.1 Let A be a square matrix over F with m(x) its minimal polynomial.
Then A is a root of a polynomial f(x) in F[x] iff f(x) is divisible by m(x)
in F[x].

Proof As F is a field, F[x] is a Euclidean domain. Hence there exist q(x) and r(x) in
F[x] such that $f(x) = m(x)q(x) + r(x)$, where $r(x) = 0$ or $\deg r < \deg m$ (by the
division algorithm). If $f(A) = 0$, then $r(A) = 0$, since $m(A) = 0$. If $r(x) \neq 0$, then
r(x) is a polynomial of degree less than the degree of m(x) which has A as a root.
This contradicts the minimality of m(x). Thus $r(x) = 0$, which shows that f(x) is
divisible by m(x). On the other hand, if f(x) is divisible by m(x) in F[x], then
$r(x) = 0$. Hence $f(A) = m(A)q(A) = 0$, i.e., A is a root of f(x). □

Remark A minimal polynomial m(x) of A over F may not be irreducible in F[x].
For example, the matrix A = ( ^ j ) over R has minimal polynomial
$m(x) = x^2 - 3x + 2 = (x - 1)(x - 2)$, which is not irreducible in R[x].

Corollary 1 Let A be a square matrix over F. Then its characteristic polynomial
is divisible by its minimal polynomial.

Proof A is a root of its characteristic polynomial $\kappa_A(x)$ by the Cayley-Hamilton
Theorem. Then the corollary follows by Proposition 8.8.1. □

Corollary 2 The minimal polynomial of a square matrix over F is unique.

Proof Let A be a square matrix over F and $m_1(x)$, $m_2(x)$ be its minimal polynomials.
Then A is a root of $m_1(x)$ and by Proposition 8.8.1, $m_1(x) = g(x) m_2(x)$
for some polynomial $g(x) \in F[x]$. Hence $\deg m_1 = \deg g + \deg m_2$. Again by
minimality of degrees, $\deg m_1 \le \deg m_2$ and $\deg m_2 \le \deg m_1$. Hence
$\deg m_1 = \deg m_2$ shows that $\deg g = 0$. Thus g(x) is a non-zero constant $c \in F$. Let
$m_2(x) = x^m + \cdots + a_1 x + a_0$. Then $m_1(x) = c x^m + \cdots + c a_1 x + c a_0$. But the
leading coefficient of $m_1$ is 1. Hence $c = 1$, which proves that $m_1(x) = m_2(x)$. □
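
Uniqueness suggests a direct way to compute the minimal polynomial: look for the first power $A^d$ that is a linear combination of $I, A, \ldots, A^{d-1}$. The Python sketch below illustrates this under stated assumptions; the function names and the sample matrices are hypothetical, and the linear systems are solved exactly over the rationals.

```python
from fractions import Fraction

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def flatten(M):
    return [v for row in M for v in row]

def solve_exact(cols, target):
    """Solve sum_j x_j * cols[j] = target over Q; return the list x, or None."""
    rows, k = len(target), len(cols)
    aug = [[cols[j][i] for j in range(k)] + [target[i]] for i in range(rows)]
    pivots, r = [], 0
    for c in range(k):
        p = next((i for i in range(r, rows) if aug[i][c] != 0), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        pv = aug[r][c]
        aug[r] = [v / pv for v in aug[r]]
        for i in range(rows):
            if i != r and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    # A zero row with a non-zero right-hand side means no solution exists.
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in aug):
        return None
    x = [Fraction(0)] * k
    for i, c in enumerate(pivots):
        x[c] = aug[i][-1]
    return x

def minimal_polynomial(A):
    """Coefficients [a_0, ..., a_{d-1}, 1] of the monic minimal polynomial of A."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    for d in range(1, n + 1):
        powers.append(mat_mul(powers[-1], A))
        x = solve_exact([flatten(P) for P in powers[:d]], flatten(powers[d]))
        if x is not None:
            # A^d = sum_j x_j A^j, so m(x) = x^d - sum_j x_j x^j.
            return [-v for v in x] + [Fraction(1)]
    raise AssertionError("unreachable: Cayley-Hamilton bounds the degree by n")

print(minimal_polynomial([[2, 0, 0], [0, 2, 0], [0, 0, 3]]))  # (x - 2)(x - 3)
print(minimal_polynomial([[0, 1, 0], [0, 0, 0], [0, 0, 0]]))  # x^2
```

The loop stops at degree at most n because of Corollary 1; the first degree at which a relation appears gives the unique monic polynomial of least degree.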

Theorem 8.8.2 Let A be a non-zero n × n matrix over F. Then A is non-singular
iff the constant term of the minimal polynomial of A is not zero.

Proof Proceed as in Theorem 8.5.12. □

Corollary If A is a non-singular square matrix over F, then its inverse is also a
polynomial in A over F.
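
For a 2 × 2 matrix the corollary can be made explicit. The sketch below is an illustration only (the function name and sample matrices are assumptions); it uses the characteristic polynomial from the Cayley-Hamilton Theorem rather than the minimal polynomial: from $A^2 - (\operatorname{tr} A) A + (\det A) I = 0$ one gets $A^{-1} = ((\operatorname{tr} A) I - A)/\det A$ whenever $\det A \neq 0$.

```python
from fractions import Fraction

def inverse_2x2_via_cayley_hamilton(A):
    """Inverse of an integer 2x2 matrix as a polynomial in A of degree 1."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    if det == 0:
        raise ValueError("A is singular: the constant term det(A) is zero")
    # A^2 - tr*A + det*I = 0  =>  A^{-1} = (tr*I - A) / det
    return [[Fraction(tr - a, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(tr - d, det)]]

print(inverse_2x2_via_cayley_hamilton([[2, 1], [1, 1]]))
print(inverse_2x2_via_cayley_hamilton([[1, 2], [3, 4]]))
```

The constant term of the polynomial appears as the divisor, which is why a zero constant term rules out an inverse.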

Remark Let V be a finite-dimensional vector space over F and $T : V \to V$ a linear
operator. If B is any basis of V, then the minimal polynomial of T is the same as the
minimal polynomial of $m(T)_B$. This is so because similar matrices have the same
minimal polynomial (see Ex. 20 of Exercises-II).

Theorem 8.8.3 Let V be an n-dimensional vector space over F and $T : V \to V$
be a linear operator with its minimal polynomial m(x). Then

(i) an element $\lambda$ in F is an eigenvalue of T iff $m(\lambda) = 0$;

(ii) the characteristic polynomial $\kappa(x)$ of T and its minimal polynomial m(x) have
the same roots.

Proof (i) Since m(x) divides $\kappa(x)$ (Corollary 1 of Proposition 8.8.1),
$\kappa(x) = q(x) m(x)$ for some $q(x) \in F[x]$. Hence $\kappa(\lambda) = q(\lambda) m(\lambda)$. If $\lambda$ is a
root of m(x), then $\lambda$ is also a root of $\kappa(x)$ and thus $\lambda$ is an eigenvalue of T.

Conversely, let $\lambda$ be an eigenvalue of T and $x \in V$ be an eigenvector corresponding
to $\lambda$. Then $m(T)x = m(\lambda)x$ by Example 8.7.3(iii) and $m(T) = 0$. Since
$x \neq 0$, $m(\lambda) = 0$. Hence $\lambda$ is a root of m(x).

(ii) It follows from (i). □


Remark The multiplicities of the roots of $\kappa(x)$ and m(x) may be different. For
example, for the matrix

A =

$\kappa(x) = -x^3$ and $m(x) = x^2$, so the multiplicities of the root 0 of $\kappa(x)$ and
m(x) are different.

Corollary Let V be an n-dimensional vector space over F and $T : V \to V$ a linear
operator with minimal polynomial m(x) and characteristic polynomial $\kappa(x)$. If
$\kappa(x) = (\lambda_1 - x)^{n_1} (\lambda_2 - x)^{n_2} \cdots (\lambda_t - x)^{n_t}$, where $\lambda_1, \ldots, \lambda_t$ are
distinct, there exist integers $m_i$ such that $1 \le m_i \le n_i$, $i = 1, 2, \ldots, t$, and
$m(x) = (x - \lambda_1)^{m_1} (x - \lambda_2)^{m_2} \cdots (x - \lambda_t)^{m_t}$.


8.9 Jordan Canonical Form 

Not every square matrix is similar to a diagonal matrix, but every square matrix
over the complex field C is similar to an upper triangular matrix (see Ex. 9 and
Ex. 24 of Exercises-II). A linear transformation on a vector space is represented
by different matrices with respect to different bases.

Given a finite-dimensional vector space V over a field F and a linear transformation
on V, one looks for a basis of V in which the associated matrix takes a
particularly simple form; such a representative of a similarity class of square
matrices over F is called a canonical form. There are many canonical forms of
matrices, namely, the triangular form, the Jordan normal form, the rational
canonical form, etc. In this section we study only the Jordan canonical form.
Since the Jordan canonical form is determined by the set of elementary divisors,
we need the basic ideas of determinant divisors, invariant factors, elementary
divisors, etc., of a square matrix. We first introduce these concepts as
background material. For other canonical forms see Exs. 21-23 of Exercises-II.

Definition 8.9.1 Let A be a square matrix of order n over a field F and $d_t(A)$
denote the highest common factor of the set of all minors of order t of A, for
$t = 1, 2, \ldots, r$, where r is the rank of A. Then $d_t(A)$ is called the t-th
determinant divisor of A.

Definition 8.9.2 If we take $d_0(A) = 1$ and $e_i = d_i(A)/d_{i-1}(A)$, for $i = 1, 2, \ldots, r$,
where rank A = r, then $e_1, e_2, \ldots, e_r$ are called the invariant factors of A.

Remark $d_i(A) = e_1 e_2 \cdots e_i$, for $i = 1, 2, \ldots, r$, and hence the invariant factors
of A determine uniquely its determinant divisors and conversely.

Definition 8.9.3 Let $p_1, p_2, \ldots, p_m$ be irreducible polynomials in F[x], each of which
is a divisor of at least one of the invariant factors $e_1, \ldots, e_r$ of a square matrix A
over F, and write $e_i = p_1^{n_{i1}} p_2^{n_{i2}} \cdots p_m^{n_{im}}$ up to a unit. Then the prime powers
$p_1^{n_{11}}, \ldots, p_m^{n_{1m}}, \ldots, p_1^{n_{r1}}, \ldots, p_m^{n_{rm}}$ for which the
indices are positive are called the elementary divisors of A over F.

Example 8.9.1 For the matrix

\[ A = \begin{pmatrix} x^2 - x & x & 0 \\ x - 2 & x & x + 4 \\ x & 0 & 3x \end{pmatrix}, \]

the determinant divisors $d_i$, invariant factors $e_i$ and elementary divisors are
respectively:

$d_1$ = hcf of $\{x^2 - x,\ x,\ x - 2,\ x,\ x + 4,\ x,\ 3x\} = 1$;

$d_2$ = hcf of $\{x(x^2 - 2x + 2),\ x(x - 1)(x + 4),\ -x^2,\ 3x^2,\ 3x(x^2 - x),\ x(x + 4),\ -x^2,\ 2x(x - 5),\ 3x^2\} = x$;

$d_3 = x^2(3x^2 - 5x + 10)$;

$e_1 = d_1 = 1$, $e_2 = d_2/d_1 = x$, $e_3 = d_3/d_2 = x(3x^2 - 5x + 10)$.

The elementary divisors of A in R[x] are $x$, $x$, $3x^2 - 5x + 10$ and those in C[x] are
$x$, $x$, $x - \alpha$, $x - \beta$, where $\alpha, \beta = (5 \pm i\sqrt{95})/6$.

We recall that if V is a finite-dimensional vector space over an algebraically
closed field F, such as the complex field C, then the characteristic polynomial of
every linear operator $T : V \to V$ decomposes into linear factors in F[x]. If
$\lambda_1, \lambda_2, \ldots, \lambda_n$ (not necessarily distinct) are all the eigenvalues of T, then T is
diagonalizable iff there is an ordered basis of V consisting of eigenvectors of T.
If $B = \{v_1, v_2, \ldots, v_n\}$ is such a basis in which $v_j$ is an eigenvector corresponding
to the eigenvalue $\lambda_j$, then the matrix of T relative to B is the diagonal matrix

\[ \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \]

We take F = C, unless otherwise stated.

Definition 8.9.4 Corresponding to an element $\lambda$ in C, an elementary Jordan matrix
or a Jordan block $J(\lambda)$ of order r is a square matrix of order r over C that has $\lambda$
in each diagonal position, 1's in each position just above the main diagonal and 0's
elsewhere.

Thus

\[ J(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ 0 & 0 & \lambda & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda \end{pmatrix} \]

is a Jordan block corresponding to $\lambda$. A Jordan block is an upper triangular matrix.

For example,

\[ (5), \quad \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}, \quad \begin{pmatrix} 7 & 1 & 0 \\ 0 & 7 & 1 \\ 0 & 0 & 7 \end{pmatrix} \]

are Jordan blocks of order 1, 2 and 3 corresponding to the eigenvalues 5, 3, and 7
respectively.


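Definition 8.9.5 The matrix

\[ \begin{pmatrix}
J_1(\lambda_1) & 0 & 0 & \cdots & 0 \\
0 & J_2(\lambda_2) & 0 & \cdots & 0 \\
0 & 0 & J_3(\lambda_3) & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & J_r(\lambda_r)
\end{pmatrix} \tag{8.3} \]

where each $J_i(\lambda_i)$ is a Jordan block corresponding to the eigenvalue $\lambda_i$ of a square
matrix A, is said to be a Jordan canonical form of A, where $\lambda_1, \lambda_2, \ldots, \lambda_r$ may not
be distinct and the orders of the Jordan blocks may be different.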


The Jordan canonical form is one of the most widely used canonical forms connecting
linear transformations and matrices. Its serious inconvenience is the condition
that all eigenvalues must lie in the ground field. A canonical form which requires
no assumption on the location of the eigenvalues is the rational canonical form,
which is described in Ex. 23 of Exercises-II.


Theorem 8.9.1 (Jordan Form Theorem) A square matrix M over a field F such
that the eigenvalues of M lie in F is similar to a Jordan matrix of the form (8.3),
which is uniquely determined, except for rearrangements of its diagonal blocks.

Proof Since the eigenvalues of M are in F, its characteristic polynomial $\kappa_M(x)$
decomposes into linear factors in F[x]:

\[ \kappa_M(x) = (\lambda_1 - x)^{n_1} (\lambda_2 - x)^{n_2} \cdots (\lambda_t - x)^{n_t}, \]

where $n_1 + n_2 + \cdots + n_t = n$ and the $\lambda_i$'s are the distinct eigenvalues of M.

Again $\kappa_M(x)$ is the product of the invariant factors, and also of the elementary
divisors, of the characteristic matrix $M - xI$. As the prime factors of $\kappa_M(x)$ in F[x]
are linear polynomials $\lambda_i - x$, it follows that the elementary divisors of $M - xI$
are of the form $(\lambda_i - x)^{r_i}$. We suppose that $\{(\lambda_1 - x)^{r_1}, \ldots, (\lambda_m - x)^{r_m}\}$ is the
complete set of elementary divisors of $M - xI$, where $\lambda_1, \lambda_2, \ldots, \lambda_m$ may not be
distinct. Let $J_i(\lambda_i)$ be the Jordan block of order $r_i$ corresponding to the eigenvalue
$\lambda_i$, $i = 1, 2, \ldots, m$. Then $(\lambda_i - x)^{r_i}$ is both the characteristic polynomial and the
minimal polynomial of the matrix $J_i(\lambda_i)$. As the minimal polynomial of $J_i(\lambda_i)$ is
the last invariant factor of the matrix $J_i(\lambda_i) - xI$ and the characteristic polynomial
of $J_i(\lambda_i)$ is the product of all invariant factors of $J_i(\lambda_i) - xI$, it follows that all the
invariant factors of $J_i(\lambda_i) - xI$, except the last one, are 1. Hence $(\lambda_i - x)^{r_i}$ is the
only elementary divisor of $J_i(\lambda_i) - xI$.

Let $J = \operatorname{diag}(J_1(\lambda_1), \ldots, J_m(\lambda_m))$ be the Jordan matrix. Then the set of the
elementary divisors of $J - xI$ is given by the union of the elementary divisors
of $J_1(\lambda_1) - xI, \ldots, J_m(\lambda_m) - xI$. Thus the elementary divisors of $J - xI$ are
given by $(\lambda_1 - x)^{r_1}, \ldots, (\lambda_m - x)^{r_m}$. Again, as the product of the elementary
divisors of $M - xI$ is equal to the characteristic polynomial $\kappa_M(x)$, it follows that
$r_1 + r_2 + \cdots + r_m = n_1 + n_2 + \cdots + n_t = n$. This shows that $J - xI$ is a matrix of
order n. Again $M - xI$ and $J - xI$ are both non-singular. Hence
$\operatorname{rank}(M - xI) = \operatorname{rank}(J - xI) = n$. Consequently, $M - xI$ and $J - xI$ are both square
matrices of order n such that they have the same rank and the same set of
elementary divisors. This implies that M is similar to the Jordan matrix J.

Again the set of elementary divisors of $M - xI$ consists of the elementary divisors
of $J_i(\lambda_i) - xI$, for $i = 1, 2, \ldots, m$; there is one elementary divisor corresponding
to each block $J_i(\lambda_i)$. Moreover, block diagonal matrices which differ only
in the arrangement of the diagonal blocks are similar. Consequently, the Jordan
matrix J is determined uniquely, except for rearrangement of the diagonal blocks,
by the given matrix M. □

Remark The Jordan matrix J similar to M is called the Jordan normal form (or the 
classical canonical form) of the matrix M. 

Corollary Any square matrix M over C is similar to a Jordan matrix, which is
determined uniquely, except for rearrangements of its diagonal blocks.

Proof As all the eigenvalues of M over C lie in C, the corollary follows from The- 
orem 8.9.1. □ 

Example 8.9.2 (i) The matrix 



is similar to the Jordan matrix $\operatorname{diag}(2, 2, 3)$, which is found by computing as follows:

Step I: $\kappa_M(x) = \det(M - xI) = -(x - 2)^2 (x - 3)$.

Step II: $d_1(x) = 1$, $d_2(x) = x - 2$, $d_3(x) = (x - 2)^2 (x - 3)$.

Step III: $e_1(x) = d_1(x) = 1$, $e_2(x) = d_2(x)/d_1(x) = x - 2 = p_1(x)$ (say),
$e_3(x) = d_3(x)/d_2(x) = (x - 2)(x - 3) = p_1(x) p_2(x)$ (say).

Fig. 8.1 Jordan normal form of M



The multiplicities of the eigenvalues 2 and 3 as roots of the invariant factors are
tabulated:

$\lambda$     $e_3(x)$   $e_2(x)$   $e_1(x)$
2         1          1          0
3         1          0          0


The symbol [(1, 1), (1)] denoting the multiplicities of the eigenvalues as roots
of the invariant factors, collected together in the first brackets for each one of the
eigenvalues, is called the Segre characteristic of M and it represents the Jordan
normal form

\[ \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}. \]

(ii) The matrix



M 


(-6 6 

r 1 1 

\-8 5 


1 

1 

8 


has no Jordan normal form over the field R.

$\kappa_M(x) = -(x + 5)(x^2 - 3x + 3)$ shows that not all of the three roots lie in R. Hence
there does not exist any Jordan normal form of M over R.

(iii) The Jordan normal form of the matrix

\[ M = \begin{pmatrix} 0 & 4 & 2 \\ -3 & 8 & 3 \\ 4 & -8 & -2 \end{pmatrix} \]

is shown in Fig. 8.1.

Here $\kappa_M(x) = -(x - 2)^3$, $d_1(x) = 1$, $d_2(x) = x - 2$, $d_3(x) = -(x - 2)^3$ and
$e_1(x) = 1$, $e_2(x) = x - 2$, $e_3(x) = -(x - 2)^2$.

The Segre characteristic of M is [(2, 1)] and it represents J.

(iv) Let M be a 3 × 3 matrix over C. Then all the eigenvalues of M lie in C. Find
the different forms of Jordan matrices similar to M.

To find the different forms of Jordan matrices similar to M, we consider all possible
cases. If $e_1, e_2, e_3$ are the invariant factors of the characteristic matrix $M - xI$, then

(a) $e_1$ is a divisor of $e_2$ and $e_2$ is a divisor of $e_3$; and

(b) the product $e_1 e_2 e_3 = \kappa_M(x)$.

These properties offer different possible cases.

Let $\lambda_1, \lambda_2, \lambda_3$ be the roots of $\kappa_M(x)$.

Case I: Let $\lambda_1, \lambda_2, \lambda_3$ be three distinct roots of $\kappa_M(x)$.

Then $\kappa_M(x) = (\lambda_1 - x)(\lambda_2 - x)(\lambda_3 - x)$ and hence by (a) and (b), $e_1(x) = 1$,
$e_2(x) = 1$ and $e_3(x) = (\lambda_1 - x)(\lambda_2 - x)(\lambda_3 - x)$.

Consequently, the Segre characteristic is [(1), (1), (1)] and hence the Jordan normal
form $J_1$ of M is

\[ J_1 = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}. \]

Case II: Let $\lambda_1 = \lambda_3$ and $\lambda_2 \neq \lambda_1$. Then $\kappa_M(x) = (\lambda_1 - x)^2 (\lambda_2 - x)$. There are
two possibilities:

(i) $e_1(x) = 1$, $e_2(x) = \lambda_1 - x$, $e_3(x) = (\lambda_1 - x)(\lambda_2 - x)$ or

(ii) $e_1(x) = 1$, $e_2(x) = 1$, $e_3(x) = (\lambda_1 - x)^2 (\lambda_2 - x)$.

The Segre characteristic for (i) is [(1, 1), (1)] and for (ii) is [(2), (1)].
Consequently, the Jordan normal form is

\[ J_2 = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix} \]

for (i) and for (ii) it is $J_3$ as shown in Fig. 8.2:

\[ J_3 = \begin{pmatrix} \lambda_1 & 1 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{pmatrix}. \]

Fig. 8.2 Jordan normal form for the Case II(ii)

Case III: $\lambda_1 = \lambda_2 = \lambda_3 = \lambda$ (say). Then there are three possibilities.

(i) $e_1(x) = \lambda - x$, $e_2(x) = \lambda - x$, $e_3(x) = \lambda - x$.

This shows that the Segre characteristic is [(1, 1, 1)] and hence the Jordan normal
form

\[ J_4 = \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{pmatrix}. \]

(ii) $e_1(x) = 1$, $e_2(x) = \lambda - x$, $e_3(x) = (\lambda - x)^2$.

This shows that the Segre characteristic is [(2, 1)] and hence the Jordan normal
form $J_5$ is as shown in Fig. 8.3:

\[ J_5 = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{pmatrix}. \]

Fig. 8.3 Jordan normal form for the Case III(ii)

(iii) $e_1(x) = 1$, $e_2(x) = 1$, $e_3(x) = (\lambda - x)^3$.

This shows that the Segre characteristic is [(3)] and hence the Jordan normal
form

\[ J_6 = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}. \]
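
The block structure found in Example 8.9.2 can be cross-checked without computing invariant factors. The sketch below uses a different, standard technique: for an eigenvalue $\lambda$, the number of Jordan blocks of order at least k equals $\operatorname{rank}(M - \lambda I)^{k-1} - \operatorname{rank}(M - \lambda I)^{k}$. The function names are assumptions made for this illustration, and the eigenvalue must be supplied by the caller.

```python
from fractions import Fraction

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(rows):
    """Rank by exact Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def jordan_block_sizes(M, lam):
    """Orders of the Jordan blocks of M for the eigenvalue lam, largest first."""
    n = len(M)
    N = [[Fraction(M[i][j]) - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    ranks = [n]                      # rank of N^0 = I
    while True:
        P = mat_mul(P, N)
        ranks.append(rank(P))
        if ranks[-1] == ranks[-2]:   # ranks have stabilised
            break
    # at_least[k-1] = number of blocks of order >= k
    at_least = [ranks[k - 1] - ranks[k] for k in range(1, len(ranks))] + [0]
    sizes = []
    for k in range(len(at_least) - 1, 0, -1):
        sizes.extend([k] * (at_least[k - 1] - at_least[k]))
    return sizes

M = [[0, 4, 2], [-3, 8, 3], [4, -8, -2]]   # the matrix of Example 8.9.2(iii)
print(jordan_block_sizes(M, 2))            # block orders for the eigenvalue 2
```

For this M the block orders agree with the Segre characteristic [(2, 1)] obtained from the invariant factors.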



8.10 Inner Product Spaces 

In our discussion, so far, the concepts of length, angle, and distance have not
appeared. In this section we study special vector spaces over R and C which admit
the above missing concepts. Such vector spaces introduce the concept of Inner
Product Spaces. For its motivation, we recall the geometry in $R^n$ and $C^n$.


8.10.1 Geometry in $R^n$ and $C^n$

Given two vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in $R^n$, the real
number $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ is called the dot product (or standard inner
product) of x and y and is denoted by $x \cdot y = \langle x, y \rangle$. Thus
$x \cdot y = \langle x, y \rangle = \sum_{i=1}^{n} x_i y_i \in R$. The set $R^n$ endowed with this dot product is
called the Euclidean n-space. The length of x, denoted by $|x|$, is defined by
$|x| = +\sqrt{\langle x, x \rangle}$. This definition of inner product is not adequate for defining
the length in $C^n$. For example, for the non-zero $x = (1 - i, 1 + i) \in C^2$, the above
definition of length shows that $|x| = 0$. Similarly, for $x = (1, 4i) \in C^2$,
$|x| = \sqrt{-15}$. So the definition of inner product needs to be modified in $C^n$ as
$\langle x, y \rangle = \sum_{i=1}^{n} x_i \bar{y}_i$, where $\bar{y}_i$ represents the complex conjugate of $y_i$.
The set $C^n$ endowed with this inner product is called the n-dimensional unitary
space. The distance between x and y in $R^n$ (or $C^n$) is defined
by $d(x, y) = \|x - y\|$, where $\| \ \|$ is the norm function (see Example 8.11.3).
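
The two counterexamples above are easy to reproduce. The following Python sketch is an illustration only (the function names are assumptions): it applies the real dot-product formula verbatim to complex vectors and compares the result with the conjugated inner product.

```python
def naive_square(x):
    """Sum of x_i * x_i: the real dot-product formula applied verbatim."""
    return sum(a * a for a in x)

def hermitian(x, y):
    """Standard inner product on C^n: sum of x_i * conj(y_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

for x in [(1 - 1j, 1 + 1j), (1, 4j)]:
    # naive_square may be zero or negative; hermitian(x, x) is the sum of |x_i|^2
    print(x, naive_square(x), hermitian(x, x))
```

With the conjugate in place, the value for a non-zero vector is always a positive real number, so a length can be defined as its square root.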

Remark The inner product defined in $R^n$ (or $C^n$) is a linear function of x for every fixed y.


8.10.2 Inner Product Spaces: Introductory Concepts 

In analytical geometry and vector analysis we study real vector spaces. We now
carry over the concept of a dot product in $R^n$ to a more abstract setting. So the
model of the inner product spaces is the Euclidean n-space $R^n$.

Definition 8.10.1 Let F = R or C and V be a vector space over F. An inner product
on V is a mapping $\langle \ , \ \rangle : V \times V \to F$ such that

(i) $\langle x, x \rangle \ge 0$ and $\langle x, x \rangle = 0$ iff $x = 0$ (positive definiteness);

(ii) for F = R, $\langle x, y \rangle = \langle y, x \rangle$ (symmetry);

for F = C, $\langle x, y \rangle = \overline{\langle y, x \rangle}$ (conjugate symmetry);

(iii) for F = R, $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ (bilinearity);

for F = C, $\langle z, \alpha x + \beta y \rangle = \bar{\alpha} \langle z, x \rangle + \bar{\beta} \langle z, y \rangle$ (conjugate bilinearity),
for all $\alpha, \beta \in F$ and $x, y, z \in V$.

The pair $(V, \langle \ , \ \rangle)$ is called an inner product space over F.

Remark 1 If there is no ambiguity, we write $(V, \langle \ , \ \rangle)$ simply as V.

Remark 2 There may exist different inner products on V.

For example, for $x = (x_1, x_2)$ and $y = (y_1, y_2)$ in $R^2$, $\langle x, y \rangle_1 = x_1 y_1 + x_2 y_2$ and
$\langle x, y \rangle_2 = (2x_1 + x_2) y_1 + (x_1 + x_2) y_2$ are both inner products on $R^2$ over R.

Example 8.10.1 (i) Let V = C([a, b]) be the vector space of all real valued continuous
functions defined on [a, b]. Then $\langle \ , \ \rangle$ defined by
$\langle f, g \rangle = \int_a^b f(x) g(x)\, dx$ is an inner product and V is an inner product space.

(ii) Let V be the vector space of all continuous complex valued functions defined
on [0, 1]. Then $\langle \ , \ \rangle$ defined by $\langle f, g \rangle = \int_0^1 f(x) \overline{g(x)}\, dx$ is an inner product
and V is an inner product space.

(iii) $C^n$ is an inner product space under the inner product defined by
$\langle x, y \rangle = \sum_{i=1}^{n} x_i \bar{y}_i$.

(iv) Let V be the vector space of all m × n matrices over R. Then $\langle \ , \ \rangle$ defined by
$\langle A, B \rangle = \operatorname{trace}(B^{T} A)$ is an inner product.

We now extend the concept of length or norm defined in $R^2$ or $R^3$ to an arbitrary
inner product space.

Definition 8.10.2 Let V be an inner product space and x be in V. The norm of x,
denoted by $\|x\|$, is defined by $\|x\| = +\sqrt{\langle x, x \rangle}$ and the distance between x and y
in V is defined by $d(x, y) = \|x - y\|$ for all x, y in V.

Remark For $x = (x_1, x_2, \ldots, x_n)$ in $R^n$, $\|x\| = +\sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$ is just the
distance of x from the origin.

We now study some properties of norm functions. 

Proposition 8.10.1 Let V be an inner product space over R. Then the norm function
$\| \ \| : V \to R$ satisfies the following properties:

(i) $\|x\| \ge 0$ and $\|x\| = 0$ iff $x = 0$;

(ii) $\|\alpha x\| = |\alpha| \, \|x\|$, $\forall x \in V$ and $\alpha \in R$;

(iii) $|\langle x, y \rangle| \le \|x\| \, \|y\|$, $\forall x, y \in V$ (Cauchy-Schwarz Inequality);

(iv) Corresponding to a non-zero vector x in V, there is a vector u in V such that
$\|u\| = 1$ and $x = \|x\| u$. (This u is called the unit vector along x.)

Proof Left as an exercise. □

Remark The Cauchy-Schwarz inequality in $R^n$ is
$(x_1 y_1 + x_2 y_2 + \cdots + x_n y_n)^2 \le (x_1^2 + x_2^2 + \cdots + x_n^2)(y_1^2 + y_2^2 + \cdots + y_n^2)$,
which was stated in 1821 by A.L. Cauchy.

The corresponding inequality in $C^n$ is

\[ |x_1 \bar{y}_1 + x_2 \bar{y}_2 + \cdots + x_n \bar{y}_n|^2 \le (|x_1|^2 + \cdots + |x_n|^2)(|y_1|^2 + \cdots + |y_n|^2). \]

We recall that two non-zero vectors x and y in $R^n$ are orthogonal iff their dot
product $x \cdot y = 0$. This leads to the following definition.

Definition 8.10.3 Two non-zero vectors x, y in an inner product space V are said
to be orthogonal iff their inner product $\langle x, y \rangle = 0$ and said to be orthonormal iff
$\langle x, y \rangle = 0$ and $\|x\| = 1 = \|y\|$.

Definition 8.10.4 A basis $B = \{v_1, v_2, \ldots, v_n\}$ of an inner product space V is
said to be an orthogonal basis (or an orthonormal basis) iff $\langle v_i, v_j \rangle = 0$ for $i \neq j$
(or $\langle v_i, v_j \rangle = 0$ for $i \neq j$ and $\|v_i\| = 1$ for all i).

Example 8.10.2 (i) The unit vectors $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, 0, \ldots, 0)$, ...,
$e_n = (0, 0, \ldots, 0, 1)$ in $R^n$ are pairwise both orthogonal and orthonormal. The set
$B = \{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis of $R^n$.

(ii) Let V be an inner product space. Then any orthonormal generating set
$S = \{v_1, v_2, \ldots, v_m\}$ forms a basis of V.

This is so because if $\sum_{i=1}^{m} \alpha_i v_i = 0$ for some $\alpha_i \in F$, then
$0 = \sum_{i=1}^{m} \alpha_i \langle v_i, v_k \rangle = \alpha_k$ (by taking the inner product with a fixed $v_k$ in S).
This is true for all $k = 1, 2, \ldots, m$. Hence S is linearly independent and such that
L(S) = V. Consequently, S forms a basis of V.

We now prescribe a process of orthogonalization to find an orthogonal basis for 
certain inner product spaces. 

Theorem 8.10.1 (Gram-Schmidt Orthogonalization Process) Every non-zero
finite-dimensional inner product space V over F has an orthogonal basis.

Proof Let dim V = n and $B = \{v_1, v_2, \ldots, v_n\}$ be a basis of V. We now construct
an orthogonal basis $S = \{u_1, u_2, \ldots, u_n\}$. We take $u_1 = v_1$. If n = 1, the theorem is
proved. If n > 1, we define $u_2 = v_2 - \alpha_1 v_1$, where
$\alpha_1 = \langle v_2, u_1 \rangle / \|u_1\|^2$ ($\alpha_1 \in F$ is called the Fourier coefficient of $v_2$ with respect
to $v_1$). Then $\langle u_2, u_1 \rangle = 0$. Since $u_2 \neq 0$ and $\langle u_2, u_1 \rangle = 0$, the set $\{u_1, u_2\}$ is
linearly independent. Hence for n = 2, the theorem is proved. We assume that the set
$A = \{u_1, u_2, \ldots, u_{m-1}\}$ is orthogonal and such that
$L(A) = L(\{v_1, v_2, \ldots, v_{m-1}\})$, $1 < m \le n$. Let $u_m$ be the non-zero
vector in V defined by $u_m = v_m - \alpha_1 v_1 - \cdots - \alpha_{m-1} v_{m-1}$ for which
$\langle u_m, u_i \rangle = 0$ for each $i = 1, 2, \ldots, m - 1$.

In general,

\[ u_m = v_m - \sum_{i=1}^{m-1} \frac{\langle v_m, u_i \rangle}{\|u_i\|^2}\, u_i, \qquad m = 2, 3, \ldots, n. \]

In this way, for given m linearly independent vectors in V, we can construct an
orthogonal set of m vectors. As dim V = n, in particular, from the given basis B, we
can construct the orthogonal set S of n vectors. This S gives the required orthogonal
basis of V. □
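
The construction in the proof translates directly into a short program. The sketch below is a minimal illustration for vectors with rational entries under the standard dot product; the function names and the sample basis of $R^3$ are assumptions made for this example.

```python
from fractions import Fraction

def inner(x, y):
    """Standard dot product on Q^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthogonalize linearly independent vectors (no normalization)."""
    basis = []
    for v in vectors:
        u = [Fraction(c) for c in v]
        for w in basis:
            # Fourier coefficient <v, w> / ||w||^2 of v with respect to w
            coeff = Fraction(inner(v, w)) / inner(w, w)
            u = [a - coeff * b for a, b in zip(u, w)]
        if all(c == 0 for c in u):
            raise ValueError("input vectors are linearly dependent")
        basis.append(u)
    return basis

U = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for u in U:
    print(u)
```

Dividing each resulting vector by its norm would give an orthonormal set; that step is omitted here because square roots generally leave the rationals.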

Corollary 1 Every non-zero finite-dimensional inner product space V has an
orthonormal basis.

Proof Let $S = \{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V. Then
$B = \left\{ \frac{u_1}{\|u_1\|}, \frac{u_2}{\|u_2\|}, \ldots, \frac{u_n}{\|u_n\|} \right\}$ forms an orthonormal basis of V. □

Corollary 2 Any set of non-zero mutually orthogonal vectors of a non-zero
finite-dimensional inner product space V is either an orthogonal basis of V or can be
extended to an orthogonal basis of V.

Proof Let dim V = n and $S = \{u_1, u_2, \ldots, u_m\}$ be the given set of non-zero mutually
orthogonal vectors of V. If m = n, there is nothing to prove. So we assume that
m < n. We now extend the set S to a basis $B = \{u_1, u_2, \ldots, u_m, u_{m+1}, \ldots, u_n\}$. If
we take $v_i = u_i$ for $i = 1, 2, \ldots, m$ and then construct $v_{m+i}$ from $u_{m+i}$ as above,
for $i = 1, 2, \ldots, n - m$, then the set $A = \{v_1, v_2, \ldots, v_n\}$ forms an orthogonal basis
of V, where the first m vectors of A are the original vectors $u_1, u_2, \ldots, u_m$. □

Corollary 3 Any set of orthonormal vectors of a non-zero finite-dimensional
inner product space V can be extended to an orthonormal basis of V.

We now introduce the concept of an orthogonal complement in an inner product 
space. 

Proposition 8.10.2 Let V be a non-zero finite-dimensional inner product space
and U be a subspace of V. Then there exists a subspace W of V such that
$V = U \oplus W$, where each vector of W is orthogonal to every vector of U.

Proof Clearly, U is also an inner product space under the inner product induced from V.
Then U has an orthonormal basis $B = \{e_1, e_2, \ldots, e_n\}$, say. We construct W as defined
by $W = \{w \in V : \langle w, e_j \rangle = 0,\ 1 \le j \le n\}$. Then W is a subspace of V and
$\langle u, w \rangle = 0$ for every $u \in U$ and every $w \in W$. Again if $v \in V$, then
the element

\[ v - \sum_{i=1}^{n} \langle v, e_i \rangle e_i \in W, \]

since

\[ \Big\langle v - \sum_{i=1}^{n} \langle v, e_i \rangle e_i,\ e_k \Big\rangle = \langle v, e_k \rangle - \langle v, e_k \rangle \|e_k\|^2 = 0, \quad \forall k \text{ such that } 1 \le k \le n. \]

Hence

\[ v = \sum_{i=1}^{n} \langle v, e_i \rangle e_i + \Big( v - \sum_{i=1}^{n} \langle v, e_i \rangle e_i \Big) = u + w \]

for some $u \in U$ and some $w \in W$.

This shows that V = U + W. Clearly, for $x \in U \cap W$, $\langle x, x \rangle = 0$ implies $x = 0$.
Hence $V = U \oplus W$. □

Definition 8.10.5 The subspace W defined above is called the orthogonal complement
of U in V and is denoted by $U^{\perp}$.

Remark If V is a finite-dimensional inner product space and U is a subspace of V, then
$V = U \oplus U^{\perp}$ and hence any vector v in V has the unique representation as
$v = u + w$, for some $u \in U$ and some $w \in W = U^{\perp}$.

Definition 8.10.6 The decomposition $V = U \oplus U^{\perp}$ is called the orthogonal
decomposition of V with respect to the subspace U and V is called an orthogonal
direct sum of U and $U^{\perp}$.

Example 8.10.3 (i) Let $U = \{(x, y) \in R^2 : 3x - y = 0\}$. Then
$U^{\perp} = \{(x, y) \in R^2 : x + 3y = 0\}$. Thus the orthogonal complement of the line U in
$R^2$ is the line passing through the origin and perpendicular to the line U in $R^2$.

(ii) The orthogonal complement of the subspace U generated by the vector
v = (2, 3, 4) in $R^3$ is the plane $2x + 3y + 4z = 0$ in $R^3$ passing through the origin and
perpendicular to the vector v.
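
The orthogonal decomposition of Example 8.10.3(ii) can be computed explicitly. The sketch below is an illustration for a one-dimensional subspace U spanned by a single vector; the function names and the vector being decomposed are assumptions made for this example.

```python
from fractions import Fraction

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def decompose(v, a):
    """Write v = u + w with u in U = span{a} and w orthogonal to a."""
    coeff = Fraction(inner(v, a), inner(a, a))
    u = [coeff * c for c in a]
    w = [x - y for x, y in zip(v, u)]
    return u, w

a = [2, 3, 4]               # spans U as in Example 8.10.3(ii)
u, w = decompose([1, 0, 0], a)
print(u, w)
print(inner(w, a))          # w lies in the plane 2x + 3y + 4z = 0
```

For a subspace of higher dimension the same idea applies with an orthonormal (or orthogonal) basis of U, summing one such projection per basis vector as in the proof of Proposition 8.10.2.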



8.11 Hilbert Spaces 

The concept of Hilbert spaces is not an actual subject of this book but we feel that
the readers should have some knowledge of such spaces. So we need the basic ideas
of normed linear spaces, Banach spaces, etc., as a background.

Definition 8.11.1 Let X be a non-empty set. A metric on X is a real valued function
$d : X \times X \to R$ satisfying the conditions:

M(1) $d(x, y) \ge 0$ and $d(x, y) = 0$ iff $x = y$;

M(2) $d(x, y) = d(y, x)$ (symmetry);

M(3) $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality)
for all x, y, z in X.

$d(x, y)$ is called the distance between x and y. The pair (X, d) is called a metric
space.

Example 8.11.1 (i) Let X be an arbitrary non-empty set. Then $d : X \times X \to R$
defined by

\[ d(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \neq y \end{cases} \]

is a metric on X and (X, d) is a metric space.

(ii) Let R be the real line. Then $|x|$ defined by

\[ |x| = \begin{cases} x & \text{if } x > 0 \\ -x & \text{if } x < 0 \\ 0 & \text{if } x = 0 \end{cases} \]

is called the absolute value function. Then $d(x, y) = |x - y|$ is a metric on R.


Definition 8.11.2 Let (X, d) be a metric space. A Cauchy sequence in X is a function
$f : N^{+} \to X$ such that for every positive real number $\epsilon$, there exists a positive
integer m such that $d(f(i), f(j)) < \epsilon$ for all integers $i > m$ and $j > m$.

Definition 8.11.3 A complete metric space is a metric space in which every Cauchy
sequence is convergent.

Example 8.11.2 Under the metric $d(x, y) = |x - y|$, [0, 1] is a complete metric space
but (0, 1] is not.

In this section we consider vector spaces over F = R or C. 

Definition 8.11.4 A normed linear space is a vector space X on which a real valued
function $\| \ \| : X \to R$, called a norm function, is defined satisfying the conditions:

N(1) $\|x\| \ge 0$ and $\|x\| = 0$ iff $x = 0$;

N(2) $\|x + y\| \le \|x\| + \|y\|$;

N(3) $\|\alpha x\| = |\alpha| \, \|x\|$,

for $x, y \in X$ and $\alpha \in R$ or C.

Remark 1 The non-negative real number $\|x\|$ is considered the length of the vector
x and hence the notion of the distance in X from an arbitrary point to the origin is
available.

Remark 2 From N(3), it follows that $\|-x\| = \|(-1)x\| = \|x\|$.

Remark 3 A normed linear space is a metric space with respect to the metric
induced by the norm, defined by $d(x, y) = \|x - y\|$.

Definition 8.11.5 A Banach space X is a normed linear space which is complete
as a metric space, i.e., every Cauchy sequence in X is convergent.

Example 8.11.3 (i) $R^n$ is a real Banach space under the norm $\|x\|$ defined by

\[ \|x\| = \Big( \sum_{i=1}^{n} |x_i|^2 \Big)^{1/2}. \]

(ii) $C^n$ is a complex Banach space under the norm $\|z\|$ defined by

\[ \|z\| = \Big( \sum_{i=1}^{n} |z_i|^2 \Big)^{1/2}. \]

(iii) Let $R^{\infty}$ be the set of all sequences $x = (x_1, x_2, \ldots, x_n, \ldots)$ of real numbers
such that $\sum_{n=1}^{\infty} |x_n|^2$ converges. Under the norm function defined by

\[ \|x\| = \Big( \sum_{n=1}^{\infty} |x_n|^2 \Big)^{1/2}, \]

the vector space $R^{\infty}$ becomes a real Banach space, called infinite-dimensional
Euclidean space. Similarly, $C^{\infty}$ is a complex Banach space, called infinite-dimensional
unitary space. The real and complex spaces $l_2$ are respectively $R^{\infty}$ and $C^{\infty}$.

(iv) For any real number p such that $1 \le p < \infty$, the normed linear space $l_p^n$ of
all n-tuples $x = (x_1, x_2, \ldots, x_n)$ of scalars in R or C, with the norm function defined
by

\[ \|x\|_p = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p}, \]

is a Banach space.

(v) For any real number p such that $1 \le p < \infty$, the normed linear space $l_p$ of
all sequences $x = \{x_1, x_2, \ldots, x_n, \ldots\}$ of scalars in R or C such that
$\sum_{n=1}^{\infty} |x_n|^p$ is convergent is a Banach space under the norm function defined by

\[ \|x\|_p = \Big( \sum_{n=1}^{\infty} |x_n|^p \Big)^{1/p}. \]

Clearly, the real $l_2$ space is the infinite-dimensional Euclidean space $R^{\infty}$ and the
complex $l_2$ space is the infinite-dimensional unitary space as defined in (iii).


Banach spaces are vector spaces equipped with the idea of the length of a vector,
but an abstract Banach space fails to provide the concept of the angle between two
vectors. A Hilbert space is a special Banach space having an additional structure
which provides the concept of orthogonality of two vectors. Spaces of this type
also fail to provide the main geometric concept of the angle between two vectors in
general, whereas the dot product in $R^n$ yields both the concepts of the angle and
orthogonality of two non-zero vectors. We now proceed to give the definition of a
Hilbert space.

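Definition 8.11.6 A Hilbert space is a complex Banach space X in which a function
$\langle \ , \ \rangle : X \times X \to C$ is defined satisfying the following conditions:

H(1): $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$;

H(2): $\overline{\langle x, y \rangle} = \langle y, x \rangle$;

H(3): $\langle x, x \rangle = \|x\|^2$.

Then $\langle x, \alpha y + \beta z \rangle = \bar{\alpha} \langle x, y \rangle + \bar{\beta} \langle x, z \rangle$, for all x, y, z in X and $\alpha, \beta$ in C.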

Remark A Hilbert space is a complex Banach space whose norm is defined by an 
inner product. 

Example 8.11.4 The spaces $l_2^n$ and $l_2$ are the main examples of Hilbert spaces,
where $\langle x, y \rangle$ is defined by

\[ \langle x, y \rangle = \sum_{i=1}^{n} x_i \bar{y}_i \ \text{ for } x, y \in l_2^n, \qquad
   \langle x, y \rangle = \sum_{n=1}^{\infty} x_n \bar{y}_n \ \text{ for } x, y \in l_2. \]

We recall the parallelogram law from elementary geometry that the sum of
squares of the sides of a parallelogram equals the sum of squares of its diagonals.
An analogue of this law holds in a Hilbert space.

Proposition 8.11.1 (Parallelogram law) Let $X$ be a Hilbert space. For any two vectors $x$ and $y$ in $X$, $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$.

Proof $\|x + y\|^2 + \|x - y\|^2 = \langle x + y, x + y\rangle + \langle x - y, x - y\rangle = 2\langle x, x\rangle + 2\langle y, y\rangle = 2\|x\|^2 + 2\|y\|^2$. □

Corollary 1 Let $X$ be a Hilbert space. Then for any $x, y$ in $X$, the following identity holds:

$$4\langle x, y\rangle = \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2.$$

Proof By converting the right hand expression into inner products, the identity is 
verified. □ 
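The parallelogram law and the identity of Corollary 1 can be checked numerically in the Hilbert space $l_2^2$. The following is only a minimal sketch: the helper names `inner`, `norm_sq` and `add`, and the two sample vectors, are illustrative assumptions and not part of the text.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument: sum of x_i * conj(y_i)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared norm ||x||^2 = <x, x>, returned as a real number."""
    return inner(x, x).real


def add(x, y, c=1):
    """Return the vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]


# Two sample vectors in C^2 (chosen arbitrarily for illustration).
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

# Parallelogram law: ||x+y||^2 + ||x-y||^2 versus 2||x||^2 + 2||y||^2.
parallelogram_lhs = norm_sq(add(x, y)) + norm_sq(add(x, y, -1))
parallelogram_rhs = 2 * norm_sq(x) + 2 * norm_sq(y)

# Right-hand side of the identity of Corollary 1.
polarization_rhs = (norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
                    + 1j * norm_sq(add(x, y, 1j))
                    - 1j * norm_sq(add(x, y, -1j)))

print(abs(parallelogram_lhs - parallelogram_rhs) < 1e-12)
print(abs(4 * inner(x, y) - polarization_rhs) < 1e-12)
```

A numerical check of this kind illustrates the identities for particular vectors only; the general statement still rests on the expansion into inner products indicated in the proof.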

Corollary 2 Let X be a complex Banach space whose norm satisfies the parallel- 
ogram law. If an inner product is defined on X by the identity of Corollary 1, then 
X is a Hilbert space. 
Proof Left as an exercise. □

Remark Hilbert spaces are precisely the complex Banach spaces in which the par- 
allelogram law is valid. 

Definition 8.11.7 Let $X$ be a Hilbert space and $x, y$ be two vectors in $X$. Then $x$ is said to be orthogonal to $y$, denoted by $x \perp y$, iff $\langle x, y\rangle = 0$.

Remark Let $X$ be a Hilbert space.

(i) $x \perp y \Leftrightarrow y \perp x$, since $\langle x, y\rangle = \overline{\langle y, x\rangle}$.

(ii) $\langle x, x\rangle = \|x\|^2$ and $x \perp 0$ for every $x$ in $X$. Clearly, 0 is the only vector orthogonal to itself.

(iii) $x \perp y \Rightarrow \|x + y\|^2 = \|x - y\|^2 = \|x\|^2 + \|y\|^2$ (geometrically, this represents the Pythagorean Theorem).

Definition 8.11.8 In an inner product space $X$, for any non-empty subset $Y$ of $X$ its orthogonal complement, denoted by $Y^{\perp}$, is defined by $Y^{\perp} = \{x \in X : x \perp y \text{ for every } y \text{ in } Y\}$.

Clearly, $\{0\}^{\perp} = X$, $X^{\perp} = \{0\}$ and $Y \cap Y^{\perp} \subseteq \{0\}$.

Proposition 8.11.2 (Cauchy-Schwarz inequality) Let $X$ be a Hilbert space and $x, y$ be any two vectors in $X$. Then $|\langle x, y\rangle| \le \|x\|\,\|y\|$.

Proof If $x = 0$ or $y = 0$, then the result is trivial. If $y \ne 0$, it is sufficient to prove that $|\langle x, y/\|y\|\rangle| \le \|x\|$. If $\|y\| = 1$, then $0 \le \|x - \langle x, y\rangle y\|^2 = \langle x, x\rangle - \langle x, y\rangle\overline{\langle x, y\rangle} = \|x\|^2 - |\langle x, y\rangle|^2 \Rightarrow |\langle x, y\rangle| \le \|x\|$. If $\|y\| \ne 1$, then applying this to the unit vector $y/\|y\|$ gives $|\langle x, y\rangle| \le \|x\|\,\|y\|$ immediately. □


8.12 Quadratic Forms 

The theory of quadratic forms arose through the study of central conics in the Euclidean plane $\mathbf{R}^2$ and central conicoids in the Euclidean space $\mathbf{R}^3$. We now carry this study over to quadratic surfaces in the Euclidean $n$-space $\mathbf{R}^n$. To classify a central conic $ax^2 + 2hxy + by^2 = 1$ in $\mathbf{R}^2$, we reduce it to the form $a_1X^2 + b_1Y^2 = 1$ by a non-singular linear transformation from $(x, y)$ to $(X, Y)$ defined by $x = X\cos\theta - Y\sin\theta$ and $y = X\sin\theta + Y\cos\theta$, which geometrically means a rotation of axes. A rotation of axes is a change of coordinates by an orthogonal transformation.

An expression of the form

$$q(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} a_i x_i^2 + 2\sum_{i<j} a_{ij} x_i x_j, \quad a_i, a_{ij} \in \mathbf{R}$$

is called a quadratic form in $\mathbf{R}^n$.





In general, given a field $F$, by a quadratic form on a vector space $F^n$ over $F$, we mean a map $q : F^n \to F$ which is a homogeneous polynomial function of degree 2 of the coordinates. For example, the function $f : \mathbf{R}^n \to \mathbf{R}$ defined by $f(x_1, x_2, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_n^2$ is a quadratic form in $\mathbf{R}^n$.

Let

$$q(x_1, x_2, \ldots, x_n) = \sum_{i,j} b_{ij} x_i x_j, \quad i, j = 1, 2, \ldots, n$$

be a quadratic form over $F$ of characteristic different from 2. By taking $a_{ij} = \frac{b_{ij} + b_{ji}}{2}$, the given quadratic form is transformed into $\sum_{i,j} a_{ij} x_i x_j$, where $a_{ij} = a_{ji}$. It can be expressed as a product $XAX^t$, where $X$ is the row matrix $(x_1\ x_2\ \ldots\ x_n)$ and $A$ is the symmetric matrix $A = (a_{ij})$.

For example, the symmetric matrix $A$ associated with the quadratic form $3x^2 + 4y^2 + 5z^2 + 4xz$ is

$$A = \begin{pmatrix} 3 & 0 & 2 \\ 0 & 4 & 0 \\ 2 & 0 & 5 \end{pmatrix}.$$
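The passage from the coefficients $b_{ij}$ to the symmetric matrix with entries $a_{ij} = (b_{ij} + b_{ji})/2$ can be sketched in a few lines for the form $3x^2 + 4y^2 + 5z^2 + 4xz$. This is an illustrative sketch over the rationals; the function names `symmetrize` and `evaluate`, and the choice to store the mixed term once as `b[0][2] = 4`, are assumptions made here.

```python
from fractions import Fraction


def symmetrize(b):
    """Return the symmetric matrix with entries a_ij = (b_ij + b_ji) / 2."""
    n = len(b)
    return [[(Fraction(b[i][j]) + Fraction(b[j][i])) / 2 for j in range(n)]
            for i in range(n)]


def evaluate(m, x):
    """Evaluate the sum of m_ij * x_i * x_j, i.e. the product X M X^t for a row vector X."""
    n = len(x)
    return sum(m[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Coefficients of 3x^2 + 4y^2 + 5z^2 + 4xz, with the xz term stored once.
b = [[3, 0, 4],
     [0, 4, 0],
     [0, 0, 5]]

A = symmetrize(b)
for row in A:
    print([str(entry) for entry in row])

# Both coefficient arrays define the same function of (x, y, z).
point = [1, 2, 3]
print(evaluate(b, point) == evaluate(A, point))
```

Division by 2 is the step that needs the characteristic of the field to be different from 2.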




8.12.1 Quadratic Forms: Introductory Concepts 

Definition 8.12.1 Let $V$ be a real inner product space of finite dimension and $T : V \to V$ be a symmetric linear operator (i.e., $\langle T(x), y\rangle = \langle x, T(y)\rangle$ for all $x, y$ in $V$). Then the mapping $q : V \to \mathbf{R}$ defined by $q(x) = \langle T(x), x\rangle$ is called a quadratic form on $V$.

We now establish a relation between a quadratic form and a symmetric matrix. 

Theorem 8.12.1 Every real symmetric matrix of order $n$ corresponds to a quadratic form on $\mathbf{R}^n$ and conversely, every quadratic form on $\mathbf{R}^n$ corresponds to a real symmetric matrix of order $n$.

Proof Let $V = \mathbf{R}^n$ and $A = (a_{ij})$ be a symmetric matrix of order $n$ over $\mathbf{R}$. We take $B$ to be the usual orthonormal basis of $V$, i.e., $B = \{e_1 = (1, 0, \ldots, 0), \ldots, e_n = (0, 0, \ldots, 1)\}$. Let $T : V \to V$ be the linear operator corresponding to $A$ with respect to the basis $B$. Then $q : V \to \mathbf{R}$, defined by $q(x) = \langle T(x), x\rangle$, is the quadratic form corresponding to $T : V \to V$. For $x = (x_1, x_2, \ldots, x_n) = \sum_i x_i e_i$,

$$T(x) = \sum_i x_i T(e_i) = \sum_{i,k} x_i a_{ik} e_k \Rightarrow q(x) = \sum_{i,j} a_{ij} x_i x_j = \sum_i a_{ii} x_i^2 + 2\sum_{i<j} a_{ij} x_i x_j.$$

This proves the first part.

Conversely, let $q$ be the quadratic form $q : V \to \mathbf{R}$ corresponding to a symmetric linear operator $T : V \to V$, where $V = \mathbf{R}^n$. Then

$$q(x) = \langle T(x), x\rangle = \Big\langle \sum_i x_i T(e_i), \sum_j x_j e_j \Big\rangle = \sum_{i,j} \langle T(e_i), e_j\rangle x_i x_j = \sum_{i,j} a_{ij} x_i x_j,$$

where $a_{ij} = \langle T(e_i), e_j\rangle$. Since $T$ is symmetric, $A = (a_{ij})$ is also symmetric. □
This matrix $A$ is called the matrix corresponding to the quadratic form $q$ on $\mathbf{R}^n$ with respect to the basis $B$. We now study the effect on $A$ of a change of the basis $B$ of $V$.

Theorem 8.12.2 Let $q$ be a quadratic form on $\mathbf{R}^n$ and $B_1 = \{e_1, e_2, \ldots, e_n\}$, $B_2 = \{f_1, f_2, \ldots, f_n\}$ be two orthonormal bases of $\mathbf{R}^n$. If $A$, $B$ are the matrices corresponding to $q$ with respect to the bases $B_1$, $B_2$ respectively and $C$ is the matrix corresponding to the linear transformation $T : V \to V$ mapping the basis $B_1$ onto $B_2$, then $B = CAC^t$.

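Proof If $A = (a_{ij})$, then by the given condition, $q(x) = \sum_{i,j} a_{ij} x_i x_j$, where $x = \sum_i x_i e_i$ (as in the proof of Theorem 8.12.1, $a_{ij} = \langle S(e_i), e_j\rangle$ for the symmetric operator $S$ defining $q$). Hence

$$q(x) = X^t A X, \quad \text{where } X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$

As $T$ sends a basis $B_1$ onto another basis $B_2$, $T$ is non-singular and hence its corresponding matrix $C = (c_{ij})$ is also non-singular. Then $f_i = \sum_j c_{ij} e_j$. If $(y_1, y_2, \ldots, y_n)$ are the coordinates of $x$ with respect to $B_2$, then $x = \sum_i y_i f_i$. Consequently,

$$x = \sum_i y_i f_i = \sum_{i,j} y_i c_{ij} e_j \Rightarrow x_j = \sum_i c_{ij} y_i \Rightarrow X^t = Y^t C,$$

where $Y$ is the column vector

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$

Hence $q(x) = X^t A X = Y^t C A C^t Y \Rightarrow$ the matrix of $q$ with respect to the basis $B_2$ is $B = CAC^t$. □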
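The relation $B = CAC^t$ can be checked on a small example. The sketch below uses exact rational arithmetic; the matrices `A` and `C` (a rotation with rational entries) and the coordinate vector `y` are arbitrary illustrative choices, and the helper names are assumptions introduced here.

```python
from fractions import Fraction as Fr


def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


def quad(M, v):
    """Value of the quadratic form with matrix M at the coordinate vector v."""
    n = len(v)
    return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


# Matrix of q with respect to the standard basis B_1 = {e_1, e_2}.
A = [[Fr(2), Fr(1)],
     [Fr(1), Fr(3)]]

# Row i of C holds the coordinates of f_i, i.e. f_i = sum_j c_ij e_j.
# These rows form an orthonormal basis B_2 of R^2.
C = [[Fr(3, 5), Fr(4, 5)],
     [Fr(-4, 5), Fr(3, 5)]]

B = mat_mul(mat_mul(C, A), transpose(C))

# Coordinates y of a vector with respect to B_2, and the coordinates
# x_j = sum_i c_ij y_i of the same vector with respect to B_1.
y = [Fr(1), Fr(2)]
x = [sum(C[i][j] * y[i] for i in range(2)) for j in range(2)]

same_value = quad(A, x) == quad(B, y)
print(same_value)
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point approximation.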


Theorem 8.12.2 introduces the concept of congruent matrices. 

Definition 8.12.2 Let $A$ and $B$ be two $n \times n$ symmetric matrices over $\mathbf{R}$. Then $B$ is said to be congruent to $A$ iff there exists a non-singular matrix $P$ such that $B = P^t A P$.


The relation of being congruent on the set of all real $n \times n$ symmetric matrices is an equivalence relation; it partitions the set into congruence classes.
Definition 8.12.3 Two quadratic forms on $\mathbf{R}^n$ are said to be congruent iff their associated symmetric matrices are congruent.

We now look for a suitable basis for R n to obtain a canonical or normal form for 
a congruence class. 

Theorem 8.12.3 (Sylvester's Theorem) Any real symmetric matrix $A$ of order $n$ is congruent to a matrix of the form

$$\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix},$$

where $I_p$, $I_q$ represent the unit $p \times p$ and $q \times q$ real matrices and 0 is the matrix of zeros. Moreover, $p$ and $q$ are uniquely determined by $A$.

Proof As $A$ is a real symmetric matrix, all the characteristic roots (eigenvalues) of $A$ are real. We arrange them in such a way that the first $p$ of them, $\lambda_1, \lambda_2, \ldots, \lambda_p$ (say), are positive, the next $q$ of them, $-\lambda_{p+1}, -\lambda_{p+2}, \ldots, -\lambda_{p+q}$ (say), are negative and the last $n - p - q$ of them are all zero. Then there exists an orthogonal matrix $C$ such that

$$C^{-1}AC = \operatorname{diag}(\lambda_1, \ldots, \lambda_p, -\lambda_{p+1}, \ldots, -\lambda_{p+q}, 0, \ldots, 0).$$

Let $D$ be the diagonal matrix

$$D = \operatorname{diag}\Big(\frac{1}{\sqrt{\lambda_1}}, \ldots, \frac{1}{\sqrt{\lambda_p}}, \frac{1}{\sqrt{\lambda_{p+1}}}, \ldots, \frac{1}{\sqrt{\lambda_{p+q}}}, 1, \ldots, 1\Big).$$

Then

$$D(C^{-1}AC)D = \begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix} = B \text{ (say)},$$

and $P = CD$ is a non-singular matrix such that $P^t = D^t C^t = DC^{-1}$. Hence $P^t A P = D(C^{-1}AC)D = B$ shows that $A$ is congruent to $B$. Uniqueness of $p$ and $q$ is proved after the corollary.

Corollary Any quadratic form $q(x) = \sum_{i,j} a_{ij} x_i x_j$ is congruent to a quadratic form

$$x_1^2 + x_2^2 + \cdots + x_p^2 - x_{p+1}^2 - x_{p+2}^2 - \cdots - x_{p+q}^2,$$

known as the normal (or diagonal) form of $q(x)$, where $p$ and $q$ are determined uniquely by $q(x)$.

Uniqueness of $p$ and $q$ Let

$$q(x) = \sum_{i=1}^{p} x_i^2 - \sum_{i=p+1}^{p+q} x_i^2 \quad \text{and} \quad q(x) = \sum_{i=1}^{t} y_i^2 - \sum_{i=t+1}^{t+s} y_i^2 \tag{A}$$

be two canonical expressions of the same quadratic form $q(x)$ with respect to bases $B = \{u_1, u_2, \ldots, u_n\}$ and $S = \{v_1, v_2, \ldots, v_n\}$, respectively. We claim that $p = t$ and $q = s$. If possible, let $p \ne t$. Without loss of generality, we assume that $p > t$. Let $U = \operatorname{span}\{u_1, u_2, \ldots, u_p\}$ and $W = \operatorname{span}\{v_{t+1}, \ldots, v_n\}$. Then $\dim U = p$ and $\dim W = n - t$. Hence $\dim U + \dim W = p + n - t > n$, as $p > t$ by assumption $\Rightarrow$ the subspaces $U$ and $W$ have non-zero common vectors $\Rightarrow$ there exists a non-zero vector $x$ in $U \cap W$.

This non-zero vector $x$ can be expressed as

$$x = x_1u_1 + \cdots + x_pu_p = y_{t+1}v_{t+1} + \cdots + y_nv_n.$$

As for this particular $x$, $x_{p+1} = \cdots = x_n = 0$ and $y_1 = \cdots = y_t = 0$, we have from (A)

$$q(x) = x_1^2 + x_2^2 + \cdots + x_p^2 \ge 0 \quad \text{and} \quad q(x) = -y_{t+1}^2 - \cdots - y_{t+s}^2 \le 0.$$

Hence $q(x) = 0$, so that $x_1 = \cdots = x_p = 0 \Rightarrow x = 0$.

This contradiction shows that $p \le t$. Similarly, $t \le p$. Hence $p = t$. Again $p + q = t + s$, both being the rank of $q$, $\Rightarrow q = s$, since $p = t$.

Thus any quadratic form $q(x) = \sum_{i,j} a_{ij} x_i x_j$ is congruent to a quadratic form of the type

$$x_1^2 + x_2^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2, \tag{B}$$

where $p$ and $q$ are uniquely determined by $q(x)$. □
Definition 8.12.4 The integer $p$ is called the index of $q(x)$, which is the number of positive terms in the expression (B). If $r$ is the rank of the matrix $A$ associated with the quadratic form $q(x)$, then the rank of $q(x)$ is defined to be the rank of $A$ and the signature of $q(x)$ is defined to be the integer $p - (r - p) = 2p - r$, which is also called the signature of the matrix $A$.

Remark If a real symmetric matrix $A$ of rank $r$ is congruent to the matrix

$$\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix},$$

then $r = p + q$ is the rank of $A$ and $p - q$ is the signature of $A$.
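The integers $p$ and $q$ can be computed without finding eigenvalues. The sketch below diagonalises a symmetric matrix by congruence, using paired row and column operations in exact rational arithmetic, and counts the signs on the diagonal. This elimination procedure is a technique swapped in for illustration; it is not the eigenvalue argument used in the proof of Theorem 8.12.3, and the function name `inertia` is an assumption made here. By the uniqueness of $p$ and $q$, both routes give the same answer.

```python
from fractions import Fraction


def inertia(A):
    """Return (p, q): the numbers of positive and negative diagonal entries
    after reducing the symmetric matrix A to diagonal form by congruence."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]

    def add_multiple(src, dst, c):
        # Row operation followed by the same column operation: M -> E M E^t.
        for k in range(n):
            M[dst][k] += c * M[src][k]
        for k in range(n):
            M[k][dst] += c * M[k][src]

    def swap(i, j):
        # Swap rows i, j and then columns i, j (congruence by a permutation).
        M[i], M[j] = M[j], M[i]
        for row in M:
            row[i], row[j] = row[j], row[i]

    for i in range(n):
        if M[i][i] == 0:
            for j in range(i + 1, n):
                if M[j][j] != 0:
                    swap(i, j)
                    break
            else:
                # All remaining diagonal entries vanish: use an off-diagonal
                # entry m_ij != 0, which produces the pivot 2 * m_ij.
                for j in range(i + 1, n):
                    if M[i][j] != 0:
                        add_multiple(j, i, Fraction(1))
                        break
        if M[i][i] == 0:
            continue  # row and column i are already zero in the remaining block
        for j in range(i + 1, n):
            if M[j][i] != 0:
                add_multiple(i, j, -M[j][i] / M[i][i])

    p = sum(1 for i in range(n) if M[i][i] > 0)
    q = sum(1 for i in range(n) if M[i][i] < 0)
    return p, q


p, q = inertia([[1, -3], [-3, 1]])   # matrix of x^2 - 6xy + y^2
print("rank:", p + q, "signature:", p - q)
```

For the form $x^2 - 6xy + y^2$ this reports rank 2 and signature 0.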

A quadratic form is of several types given in the following definition. 

Definition 8.12.5 A quadratic form $q : V \to \mathbf{R}$ is said to be

(i) positive definite provided $q(x) > 0$ for all non-zero vectors $x$ in $V$;

(ii) negative definite provided $q(x) < 0$ for all non-zero vectors $x$ in $V$;

(iii) positive semidefinite provided $q(x) \ge 0$ for all $x$ in $V$ and $q(x) = 0$ for some non-zero vector $x$ in $V$;

(iv) negative semidefinite provided $q(x) \le 0$ for all $x$ in $V$ and $q(x) = 0$ for some non-zero vector $x$ in $V$;

(v) indefinite provided there exist vectors $u$ and $v$ in $V$ such that $q(u) > 0$ and $q(v) < 0$.

The associated real symmetric matrix A is said to be positive definite, negative def- 
inite, positive semidefinite etc., according to q(x) being so. 

We now characterize quadratic forms with the help of their associated matrices. 

Theorem 8.12.4 Let $q : \mathbf{R}^n \to \mathbf{R}$ be a quadratic form and $A$ be its associated matrix. Then $q(x)$ is positive definite iff the eigenvalues of $A$ are all positive.

Proof Let the quadratic form $q(x)$ be positive definite and

$$D = \begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix}$$

be a canonical form of $A$. Then $D$ is the matrix corresponding to $q(x)$ with respect to some basis $B$ of $\mathbf{R}^n$. If $(x_1, x_2, \ldots, x_n)$ are the coordinates of $x$ with respect to $B$, then

$$q(x) = x_1^2 + x_2^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+q}^2.$$

If $p < n$, then for $x = (0, \ldots, 0, 1, 1, \ldots, 1)$, with $p$ zeros followed by $n - p$ ones, $q(x) \le 0$. As $q$ is positive definite, this contradiction shows that $p = n$.

Conversely, let all the eigenvalues of $A$ be positive. Then $A$ is congruent to the identity matrix and hence $q(x) = x_1^2 + x_2^2 + \cdots + x_n^2$ with respect to some basis of $\mathbf{R}^n$. This implies that $q$ is positive definite. □

Corollary A quadratic form $q : \mathbf{R}^n \to \mathbf{R}$ is positive definite if rank of $q$ = signature of $q$ = $n$.


Theorem 8.12.5 A quadratic form $q : \mathbf{R}^n \to \mathbf{R}$ is negative definite if all the eigenvalues of the matrix $A$ corresponding to $q(x)$ are negative and is indefinite if $A$ has both positive and negative eigenvalues.

Proof Left as an exercise. □ 

Definition 8.12.6 Let $V$ be an $n$-dimensional inner product space over $\mathbf{R}$ and $T : V \to V$ be a symmetric linear operator. Then $T$ is said to be positive definite iff $\langle T(x), x\rangle > 0$ for all non-zero vectors $x$ in $V$.

Proposition 8.12.1 Let $V$ be an $n$-dimensional inner product space over $\mathbf{R}$. If $T : V \to V$ is positive definite, then all eigenvalues of $T$ are positive.

Proof Let $v \ne 0$ be an eigenvector of $T$ corresponding to an eigenvalue $\lambda$. Then $\langle T(v), v\rangle = \langle \lambda v, v\rangle = \lambda\langle v, v\rangle$. Since $\langle v, v\rangle > 0$ and $\langle T(v), v\rangle > 0$, it follows that $\lambda > 0$. □


8.13 Exercises 


Exercises-II 

1. Find a linear operator $T : \mathbf{R}^2 \to \mathbf{R}^4$ corresponding to the matrix

$$\begin{pmatrix} 3 & -1 \\ -2 & 4 \\ 0 & 2 \\ 1 & 0 \end{pmatrix}.$$

[Hint. See Example 8.6.2.]

2. Let $V$ be an $n$-dimensional vector space over $\mathbf{R}$ and $B$ be any basis of $V$. If $I : V \to V$ is the identity map, show that $m(I)_B = I_n$ (identity matrix).

3. Let $V$ be the vector space generated by three functions $f(x) = 1$, $g(x) = x$, $h(x) = x^2$. If $D : V \to V$ is the differential operator, find $m(D)_B$, where $B = \{f(x), g(x), h(x)\}$.

Ans.:

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}.$$
4. Let $V$ be a vector space and $A$, $B$ be subspaces of $V$. Show that

(a) the vector spaces $(A + B)/B$ and $A/(A \cap B)$ are isomorphic;

(b) the vector spaces $(A + B)/A$ and $B/(A \cap B)$ are isomorphic

(Second Isomorphism Theorem or Quotient of a Sum Theorem).

[Hint. Define the map $T : A \to (A + B)/B$ by $T(x) = x + B$. Then $T$ is a linear transformation and onto. Clearly, $\ker T = A \cap B$. Apply the First Isomorphism Theorem.]

5. Let $A$ and $B$ be subspaces of a vector space $V$ such that $B \subseteq A \subseteq V$. Show that the vector spaces $(V/B)/(A/B)$ and $V/A$ are isomorphic (Third Isomorphism Theorem or Quotient of a Quotient Theorem for Vector Spaces).

[Hint. Define the map $T : V/B \to V/A$ by $T(x + B) = x + A$. Then $T$ is onto and a linear transformation. Clearly, $\ker T = A/B$. Apply the First Isomorphism Theorem.]

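6. Let $V$ be the vector space of $2 \times 2$ real matrices and $M = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \in V$. If $T : V \to V$ is the linear operator defined by $T(A) = AM - MA$, find a basis of $\ker T$.

[Hint.

$$\ker T = \left\{ \begin{pmatrix} x & y \\ u & v \end{pmatrix} : \begin{pmatrix} x & y \\ u & v \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} - \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} x & y \\ u & v \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right\},$$

that is,

$$\begin{pmatrix} -2u & 2x + 2y - 2v \\ -2u & 2u \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$

Hence $u = 0$ and $2x + 2y - 2v = 0 \Rightarrow u = 0$ and $x + y = v \Rightarrow \dim\ker T = 2$.

If we take $x = 1$, $y = -1$, then $v = 0$ and if $x = 1$, $y = 0$, then $v = 1$. Hence the matrices $B = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$ and $C = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ form a basis of $\ker T$.]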

7. Show that there do not exist $n \times n$ real matrices $A$ and $B$ such that $AB - BA = I$ (identity matrix).

[Hint. If $AB - BA = I$, then $\operatorname{tr}(AB) - \operatorname{tr}(BA) = n$. Hence $0 = n$, which is a contradiction.]

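8. Let $V$ be the vector space of $2 \times 2$ real matrices and $T : V \to \mathbf{R}^4$ be the linear map defined by $T\left(\begin{pmatrix} x & y \\ s & t \end{pmatrix}\right) = (x, y, s, t)$. Verify the Sylvester Law of Nullity.
[Hint. $\ker T = \left\{\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\right\}$ and $\operatorname{Im} T = \mathbf{R}^4$.]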

9. Show that the matrix A = ( ^ j ) cannot be similar to a diagonal matrix. 

[Hint. Suppose $D$ is a diagonal matrix such that $A$ is similar to $D$. Then there exists a non-singular matrix $P = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ such that $PAP^{-1} = D$. Then it can be shown that $a = 0 = c$. Hence $ad - bc = 0$, a contradiction.]

10. Let $V$ be the inner product space $\mathbf{R}[x]$ of all polynomials over $\mathbf{R}$, with inner product defined by $\langle f(x), g(x)\rangle = \int_{-1}^{1} f(x)g(x)\,dx$. Given the sequence $S = \{1, x, x^2, x^3, \ldots\}$ of linearly independent vectors, find an orthogonal sequence of linearly independent vectors.

[Hint. Let $B = \{u_1(x), u_2(x), u_3(x), \ldots\}$ be the required orthogonal sequence. Then $u_1(x) = 1$, $u_2(x) = x - \dfrac{\int_{-1}^{1} x\,dx}{\int_{-1}^{1} dx}\cdot 1 = x$, $u_3(x) = x^2 - \frac{1}{3}$, $u_4(x) = x^3 - \frac{3}{5}x$, ... by the Gram-Schmidt Orthogonalization Process.]
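The Gram-Schmidt computation in the hint can be reproduced exactly. The sketch below represents a polynomial by its list of coefficients (constant term first) and assumes the inner product is the integral over $[-1, 1]$, which is what the hint's computation indicates; the function names are illustrative assumptions.

```python
from fractions import Fraction


def inner(f, g):
    """<f, g> = integral of f(x) g(x) over [-1, 1] for coefficient lists f, g."""
    total = Fraction(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            k = i + j
            if k % 2 == 0:
                # integral of x^k over [-1, 1] is 2/(k+1) for even k, 0 for odd k
                total += Fraction(a) * Fraction(b) * Fraction(2, k + 1)
    return total


def sub_scaled(u, c, v):
    """Return the coefficient list of u - c*v, padding the shorter list with zeros."""
    n = max(len(u), len(v))
    u = list(u) + [Fraction(0)] * (n - len(u))
    v = list(v) + [Fraction(0)] * (n - len(v))
    return [a - c * b for a, b in zip(u, v)]


def gram_schmidt(polys):
    """Orthogonalise the given polynomials in order (no normalisation)."""
    basis = []
    for p in polys:
        u = [Fraction(c) for c in p]
        for v in basis:
            u = sub_scaled(u, inner(p, v) / inner(v, v), v)
        basis.append(u)
    return basis


monomials = [[1], [0, 1], [0, 0, 1], [0, 0, 0, 1]]   # 1, x, x^2, x^3
orthogonal = gram_schmidt(monomials)
for u in orthogonal:
    print([str(c) for c in u])
```

The four lists produced correspond to $1$, $x$, $x^2 - \frac{1}{3}$ and $x^3 - \frac{3}{5}x$.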

11. Find an orthonormal basis for the space of solutions of the linear equation $3x - 2y + z = 0$.

[Hint. For $z = 1$, the equation $3x - 2y + 1 = 0$ has values $x = 1$, $y = 2$ or $x = 3$, $y = 5$. Clearly, $u = (1, 2, 1)$ and $v = (3, 5, 1)$ are linearly independent. If $V$ is the space of solutions of the equation, then $\dim V = 2$; $u$ and $v$ form a basis of $V$. Using the Gram-Schmidt orthogonalization process, $\frac{1}{\sqrt{6}}(1, 2, 1)$ and $\frac{1}{\sqrt{21}}(2, 1, -4)$ form an orthonormal basis of $V$.]

12. Show that any orthogonal set of non-zero vectors in an inner product space V 
is linearly independent. 

[Hint. Let $B = \{u_i\}$ be an orthogonal set of non-zero vectors. Suppose $\alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_nu_n = 0$. Then for any $u_j$, $0 = \langle 0, u_j\rangle = \langle \alpha_1u_1 + \alpha_2u_2 + \cdots + \alpha_ju_j + \cdots + \alpha_nu_n, u_j\rangle = \alpha_j\langle u_j, u_j\rangle$ for $j = 1, 2, \ldots, n$.]

Remark A maximal orthonormal set in an inner product space V is called a 
Hilbert Basis for V. 


13. Let $V$ be the vector space of real valued differentiable functions over $\mathbf{R}$ and $n \ne 0$ be an integer. If $D : V \to V$ is the differential operator on $V$, show that the functions $\sin nx$ and $\cos nx$ are eigenvectors of $D^2$.

[Hint. $D^2(\sin nx) = -n^2\sin nx \Rightarrow \sin nx$ is an eigenvector of $D^2$ corresponding to the eigenvalue $-n^2$.]

14. Find the Jordan canonical form of the matrix 



Ans.: 



15. If a quadratic surface is represented by $q(x, y, z) : x^2 + 2y^2 + 2z^2 + 2xy + 2zx = 1$, find its matrix.

[Hint. Consider the matrix

$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 0 \\ 1 & 0 & 2 \end{pmatrix}$$

associated with $q$. The eigenvalues of $A$ are 0, 2, 3. Then $q$ is reduced to $2u^2 + 3v^2 = 1$. It represents an elliptic cylinder in $\mathbf{R}^3$.]
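The eigenvalues quoted in the hint to Exercise 15 can be confirmed by evaluating $\det(A - \lambda I)$ at the claimed values. In this minimal sketch the matrix `A` is read off from the coefficients of $x^2 + 2y^2 + 2z^2 + 2xy + 2zx$ (half of each mixed coefficient off the diagonal); the helper names `det3` and `char_poly_at` are assumptions introduced here.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def char_poly_at(m, lam):
    """Value of det(m - lam * I) for a 3 x 3 matrix m."""
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


# Symmetric matrix of x^2 + 2y^2 + 2z^2 + 2xy + 2zx.
A = [[1, 1, 1],
     [1, 2, 0],
     [1, 0, 2]]

for lam in (0, 2, 3):
    print(lam, char_poly_at(A, lam))
```

Since a $3 \times 3$ matrix has at most three eigenvalues, three distinct roots of the characteristic polynomial account for all of them.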
16. Find all quadratic forms $q(x, y)$ of rank 2 which are in canonical forms. Determine the conic representing the equation $q(x, y) = 1$.

[Hint. rank $q = 2$ shows that the possible canonical forms of $q$ are $x^2 + y^2$, $x^2 - y^2$ or $-x^2 - y^2$.]

17. Find the minimal polynomial of $A = \begin{pmatrix} 2 & 2 \\ 2 & -2 \end{pmatrix}$ over $\mathbf{R}$.

[Hint. $A^2 = \begin{pmatrix} 8 & 0 \\ 0 & 8 \end{pmatrix} = 8I \Rightarrow A^2 - 8I = 0 \Rightarrow A$ satisfies the polynomial $f(x) = x^2 - 8$. Let $g(x)$ be a polynomial of degree 1 such that $g(A) = 0$. Then $g(x) = a_0 + a_1x$, $a_1 \ne 0$ and $g(A) = a_0I + a_1A = 0 \Rightarrow A = bI$, where $b = -a_1^{-1}a_0$, a contradiction, since $A$ is not of the form $bI$.]

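18. Let $U$, $V$ and $W$ be finite-dimensional vector spaces over $F$. If $T : U \to V$ and $S : V \to W$ are linear transformations, show that $\operatorname{rank}(S \circ T) \le \min\{\operatorname{rank} T, \operatorname{rank} S\}$.

[Hint. The dimension of a subspace $\le$ dimension of the whole space $\Rightarrow \dim\operatorname{Im}(S \circ T) \le \dim\operatorname{Im} S$. Hence $\operatorname{rank}(S \circ T) \le \operatorname{rank} S$.]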

19. Show that a square matrix A is orthogonally diagonalizable iff A is symmetric. 

[Hint. $A$ is orthogonally diagonalizable $\Rightarrow$ there exists an orthogonal matrix $P$ such that $P^{-1}AP = D$ (diagonal matrix). Again $P^tP = I = PP^t \Rightarrow P^{-1} = P^t$. This implies $A = PDP^t$. Hence $A^t = A$. The converse part is similar.]

20. Show that two similar matrices have the same minimal polynomial but its con- 
verse is not true. 

[Hint. Let $A$ and $B$ be two similar matrices over $F$. Then there exists a non-singular matrix $P$ of order $n$ over $F$ such that $B = PAP^{-1}$. Let $m(x)$ and $q(x)$ be the minimal polynomials of $A$ and $B$, respectively. Now $B^2 = BB = PA^2P^{-1}$. By induction it follows that $B^n = PA^nP^{-1}$ for all positive integers $n$. Suppose $m(x) = a_0 + a_1x + \cdots + x^t$. Then $m(A) = 0$ and hence $m(B) = a_0I + a_1B + \cdots + B^t = Pm(A)P^{-1} = 0$. Again by the property of minimal polynomials, $q(x) \mid m(x)$ and, by the same argument with $A$ and $B$ interchanged, $m(x) \mid q(x)$; and their leading coefficients are both 1. Hence $m(x) = q(x)$.

The converse is not true. The matrices A = (* *) and B = }) have the 

same minimal polynomial $x(x - 1)$. But rank $A = 1$ and rank $B = 2 \Rightarrow A$ is not similar to $B$ by Proposition 8.6.4.]

21. Prove that any square matrix $A$, of order $n$, over a Euclidean domain $F$, is equivalent to a diagonal matrix of the form

$$\operatorname{diag}(e_1, e_2, \ldots, e_r, 0, \ldots, 0),$$

called the Smith normal form of $A$ in honor of H.J.S. Smith (1826-1883), where each $e_i$ is a divisor of $e_{i+1}$, for $i = 1, 2, \ldots, r - 1$, where $r$ = rank $A$.

[Hint. See Hazewinkel et al. (2011).]
22. Let $M_n(F)$ be the set of all $n \times n$ matrices over a field $F$. Then a matrix $A = (a_{ij})$ in $M_n(F)$ is said to be triangular iff $a_{ij} = 0$ whenever $i > j$ or $a_{ij} = 0$ whenever $i < j$ (i.e., iff all the entries below or above the main diagonal are 0).

Let $A \in M_n(F)$ be a triangular matrix. Show that

(a) $\det A$ is the product of its diagonal entries;

(b) if no entry on the main diagonal is 0, then $A$ is invertible, otherwise $A$ is singular;

(c) if $A$ has all its characteristic roots in $F$, then there is a non-singular matrix $P$ in $M_n(F)$ such that $PAP^{-1}$ is a triangular matrix.

[Hint. See Theorem 6.4.1, Herstein (1964, p. 287).] 

23. Let $V$ be a finite-dimensional vector space over $F$ and $T : V \to V$ a linear operator. A subspace $U$ of $V$ is said to be a $T$-invariant subspace of $V$ iff $T(U) \subseteq U$, and $V$ is said to be a direct sum of (non-zero) $T$-invariant subspaces $V_1, V_2, \ldots, V_r$ iff $V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$ and $T(V_i) \subseteq V_i$, $i = 1, 2, \ldots, r$. Let $T_i$ be the restriction of $T$ to the invariant subspace $V_i$ of $V$ for each $i$. Then $T$ is said to be decomposable into the operators $T_i$, or $T$ is said to be the direct sum of the $T_i$'s, denoted $T = T_1 \oplus T_2 \oplus \cdots \oplus T_r$.

Let $V$ be a finite-dimensional vector space over $F$ and $T : V \to V$ be a linear operator such that $V = V_1 \oplus V_2 \oplus \cdots \oplus V_r$, where each $V_i$ is a $T$-invariant subspace of $V$ and $T$ is the direct sum of the $T_i$'s, where $T_i$ is the restriction of $T$ to the invariant subspace $V_i$. Clearly, the eigenspace of $T$ corresponding to each eigenvalue is a $T$-invariant subspace of $V$.

Then the following statements are true: 

(a) If $\chi(x)$ is the characteristic polynomial of $T : V \to V$ and $\chi_i(x)$ is the characteristic polynomial of $T_i : V_i \to V_i$, then $\chi(x) = \chi_1(x)\chi_2(x)\cdots\chi_r(x)$.

(b) If $A_i$ is the matrix of $T_i : V_i \to V_i$ relative to some ordered basis $B_i$ of $V_i$, then $T = T_1 \oplus T_2 \oplus \cdots \oplus T_r$ iff the matrix $M$ of $T$ relative to the ordered basis $B$ which is the union of the $B_i$'s arranged in the order $B_1, B_2, \ldots, B_r$, is given by

$$M = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{pmatrix}.$$


(c) If $m(x) = a_0 + a_1x + \cdots + a_{t-1}x^{t-1} + x^t$ is in $F[x]$, then the $t \times t$ matrix

$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{t-1} \end{pmatrix}$$

is called the companion matrix of $m(x)$, denoted by $C(m(x))$. If $T : V \to V$ has minimal polynomial $m(x) = q_1(x)^{t_1} \cdots q_r(x)^{t_r}$ over $F$, where $q_1(x), q_2(x), \ldots, q_r(x)$ are distinct irreducible polynomials in $F[x]$, then there exists a basis $B$ of $V$ such that $m(T)_B$ is of the form

$$\begin{pmatrix} M_1 & & & \\ & M_2 & & \\ & & \ddots & \\ & & & M_r \end{pmatrix},$$

where each matrix $M_i$ is of the form

$$M_i = \begin{pmatrix} C(q_i(x)^{t_{i1}}) & & \\ & \ddots & \\ & & C(q_i(x)^{t_{ir_i}}) \end{pmatrix}, \tag{8.5}$$

where $t_i = t_{i1} \ge t_{i2} \ge \cdots \ge t_{ir_i}$. The matrix (8.5) of $T$ is called the rational canonical form of $T$.

[Hint. See Corollary, Herstein (1964, p. 308).]
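The companion matrix of part (c) can be built directly from the coefficients, and one can check on an example that the polynomial annihilates its own companion matrix. The following is a sketch with integer coefficients; the function names and the sample polynomial $x^3 - 6x^2 + 11x - 6$ are illustrative assumptions.

```python
def companion(coeffs):
    """Companion matrix of m(x) = a0 + a1 x + ... + a_{t-1} x^{t-1} + x^t,
    given coeffs = [a0, a1, ..., a_{t-1}], in the layout used above:
    ones on the superdiagonal and -a0, ..., -a_{t-1} in the last row."""
    t = len(coeffs)
    C = [[0] * t for _ in range(t)]
    for i in range(t - 1):
        C[i][i + 1] = 1
    C[t - 1] = [-a for a in coeffs]
    return C


def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, C):
    """Evaluate m(C) = a0 I + a1 C + ... + C^t by Horner's rule."""
    t = len(C)
    identity = [[int(i == j) for j in range(t)] for i in range(t)]
    result = identity                      # leading coefficient 1
    for a in reversed(coeffs):
        result = mat_mul(result, C)
        result = [[result[i][j] + a * identity[i][j] for j in range(t)]
                  for i in range(t)]
    return result


coeffs = [-6, 11, -6]                      # m(x) = x^3 - 6x^2 + 11x - 6
C = companion(coeffs)
for row in C:
    print(row)
print(poly_at_matrix(coeffs, C))           # expected to be the zero matrix
```

Horner's rule is used so that only $t$ matrix products are needed, one per coefficient.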

24. Show that every square matrix is similar to an upper triangular matrix over C. 

25. Let $V$ be an $n$-dimensional vector space over $F$. $V$ is said to be Noetherian (Artinian) iff it satisfies acc (dcc) for its subspaces. Show that $V$ is both Noetherian and Artinian.

[Hint. Let $U$ be a proper subspace of $V$; then $\dim U < \dim V = n$. Hence any proper ascending (descending) chain of subspaces of $V$ cannot contain more than $n + 1$ terms.]


Exercises A (Objective Type) Identify the correct alternative(s) (there may be 
more than one) from the following list: 

1. Let $V$ be the vector space of all real polynomials of degree at most 3. Let $D : V \to V$ be the differential operator defined by $D(f(x)) = f'(x)$, where $f'$ is the derivative of $f$. Then the matrix of $D$ for the basis $\{1, x, x^2, x^3\}$, considered as column vectors, is given by


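(a) $\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}$; (b) $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \end{pmatrix}$;

(c) $\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$; (d) $\begin{pmatrix} 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$.
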
2. Let $V$ be the vector space of all symmetric matrices of order $n \times n$ ($n \ge 2$) with real entries and trace equal to zero. If $d$ is the dimension of $V$, then $d$ is

(a) $\dfrac{n^2 + n}{2}$ (b) $\dfrac{n^2 - 2n}{2}$ (c) $\dfrac{n^2 + 2n}{2}$ (d) $\dfrac{n^2 - n}{2}$.

3. For the quadratic form q = x 2 — 6xy + y 2 , 

(a) Rank of q is 3; (b) Signature of q is 1; 

(c) Rank of q is 2; (d) Signature of q is 0. 

4. For a positive integer $n$, let $V_n$ denote the space of all polynomials $f(x)$ with coefficients in $\mathbf{R}$ such that $\deg f(x) \le n$, and let $B_n$ denote the standard basis of $V_n$ given by $B_n = \{1, x, x^2, \ldots, x^n\}$. If $T : V_3 \to V_4$ is the linear transformation defined by $T(f(x)) = x^2f'(x) + \int_0^x f(t)\,dt$ and $A = (a_{ij})$ is the $5 \times 4$ matrix of $T$ with respect to the standard bases $B_3$ and $B_4$, then


(a) $a_{32} = 0$ and $a_{33} = \frac{7}{3}$;

(b) $a_{32} = \frac{3}{2}$ and $a_{33} = 0$;

(c) $a_{32} = \frac{3}{2}$ and $a_{33} = \frac{7}{3}$;

(d) $a_{32} = 0$ and $a_{33} = 0$.

5. Let M be a 3 x 3 matrix with real entries such that det(M) = 6 and the trace of 
M is 0. If det(M + I) = 0, where I denotes the 3x3 identity matrix, then the 
eigenvalues of M are 

(a) $1, 2, -3$ (b) $-1, 2, -3$ (c) $-1, 2, 3$ (d) $-1, -2, 3$.

6. If 



then 


(a) 0 and 3 are the only eigenvalues of M\ 

(b) M is positive semi definite; 

(c) M is not diagonalizable; 

(d) M is not positive definite. 

7. Let the matrix 



$M =$
have a non-zero complex eigenvalue $\lambda$. Which of the following numbers must be an eigenvalue of $M$?

(a) $20 - \lambda$ (b) $\lambda - 20$ (c) $\lambda + 20$ (d) $-20 - \lambda$.


8. A Jordan canonical form of the matrix

$$\begin{pmatrix} 0 & 0 & 0 & -4 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

is


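(a) $\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & -2 \end{pmatrix}$; (b) $\begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & -2 \end{pmatrix}$;

(c) $\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & -2 \end{pmatrix}$; (d) $\begin{pmatrix} -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & -2 \end{pmatrix}$.
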

9. For a given positive integer $n$, let $M_n$ be the vector space of all $n \times n$ real matrices $A = (a_{ij})$ such that $a_{ij} = a_{rs}$ whenever $i + j = r + s$ ($i, j, r, s = 1, \ldots, n$). Then the dimension of $M_n$, as a vector space over $\mathbf{R}$, is

(a) $2n + 1$ (b) $n^2 - n + 1$ (c) $n^2$ (d) $2n - 1$.


10. Let $V$ be the vector space of all symmetric matrices $A = (a_{ij})$ of order $n \times n$ ($n \ge 2$) with real entries, $a_{11} = 0$ and trace zero. Then the dimension of $V$ is

(a) $(n^2 + n - 4)/2$; (b) $(n^2 - n + 3)/2$;

(c) $(n^2 + n - 3)/2$; (d) $(n^2 - n + 4)/2$.


11. For a positive integer $n$, let $M_n(\mathbf{R})$ be the vector space of all $n \times n$ real matrices and $T : M_n(\mathbf{R}) \to M_n(\mathbf{R})$ be a linear transformation such that $T(A) = 0$, whenever $A \in M_n(\mathbf{R})$ is symmetric or skew-symmetric. Then the rank of $T$ is

(a) $\dfrac{n(n + 1)}{2}$ (b) $0$ (c) $n$ (d) $\dfrac{n(n - 1)}{2}$.

12. If $S : \mathbf{R}^3 \to \mathbf{R}^4$ and $T : \mathbf{R}^4 \to \mathbf{R}^3$ are two linear transformations such that $T \circ S$ is the identity transformation of $\mathbf{R}^3$, then

(a) $S \circ T$ is surjective but not injective;

(b) $S \circ T$ is injective but not surjective;

(c) $S \circ T$ is the identity map of $\mathbf{R}^4$;

(d) $S \circ T$ is neither injective nor surjective.

13. If V is a 3-dimensional vector space over the field F 3 = Z 3 of three elements, 
then the number of distinct 1 -dimensional subspaces of V is 

(a) 13 (b) 15 (c) 9 (d) 26. 

14. Which of the following statements is (are) correct? 

(a) The eigenvalues of a unitary matrix are all equal to 1 ; 

(b) The eigenvalues of a unitary matrix are all equal to $\pm 1$;

(c) The determinant of real orthogonal matrix is always ± 1 ; 

(d) The determinant of real orthogonal matrix is always — 1 . 

15. Let $V$ be a vector space of dimension $d < \infty$ over $\mathbf{R}$. Let $U$ be a vector subspace of $V$. Let $S$ be a non-empty subset of $V$. Which of the following statements is (are) correct?

(a) If $S$ is a basis of $V$, then $U \cap S$ is a basis of $U$;

(b) If $U \cap S$ is a basis of $U$ and $\{s + U \in V/U : s \in S\}$ is a basis of $V/U$, then $S$ is a basis of $V$;

(c) If $\dim U = n$, then $\dim V/U$ is $d - n$;

(d) If $S$ is a basis of $U$ as well as $V$, then the dimension of $U$ is $d$.

16. Let $V$ be the inner product space consisting of linear polynomials $f : [0, 1] \to \mathbf{R}$ (i.e., $V$ consists of polynomials of the form $f(x) = ax + b$, $a, b \in \mathbf{R}$), with the inner product defined by $\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$ for $f, g \in V$. An orthogonal basis of $V$ is then

(a) {1,4; (b) 

(c) {l, (2x — 1)V3}; (d) {1,W3}. 

17. If $f(x)$ is the minimal polynomial of the $4 \times 4$ matrix

$$M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix},$$

then the rank of the $4 \times 4$ matrix $f(M)$ is

(a) 0 (b) 4 (c) 2 (d) 1.

18. Let $a, b, c$ be positive real numbers such that $b^2 + c^2 < a < 1$ and $M$ be the matrix given by

$$M = \begin{pmatrix} 1 & b & c \\ b & a & 0 \\ c & 0 & 1 \end{pmatrix}.$$

Then

(a) $M$ can have a positive as well as a negative eigenvalue;

(b) all the eigenvalues of $M$ are positive numbers;

(c) all the eigenvalues of $M$ are negative numbers;

(d) eigenvalues of $M$ can be non-real complex numbers.

19. If $f : \mathbf{R}^n \to \mathbf{R}$ is a linear map such that $f(0, \ldots, 0) = 0$, then the set $\{f(x_1, x_2, \ldots, x_n) : \sum_{j=1}^{n} x_j^2 \le 1\}$ is

(a) $[0, a]$ for some $a \in \mathbf{R}$, $a \ge 0$;

(b) $[0, 1]$;

(c) $[-a, a]$ for some $a \in \mathbf{R}$, $a \ge 0$;

(d) $[a, b]$ for some $a, b \in \mathbf{R}$, $0 \le a < b$.

20. Let the system of equations 


x + y + z = l 
2x + 3y — z = 5 
x + 2y — kz = 4 

where $k \in \mathbf{R}$, have an infinite number of solutions. Then the value of $k$ is

(a) 2 (b) 1 (c) 0 (d) 3. 

21. Let $A = (a_{ij})$ be an $n \times n$ complex matrix and let $A^*$ denote the conjugate transpose of $A$. Which of the following statements are necessarily true?

(a) If $\operatorname{tr}(A^*A) \ne 0$, then $A$ is invertible;

(b) If $A$ is invertible, then $\operatorname{tr}(A^*A) \ne 0$, i.e., the trace of $A^*A$ is non-zero;

(c) If $\operatorname{tr}(A^*A) = 0$, then $A$ is the zero matrix;

(d) If $|\operatorname{tr}(A^*A)| < n^2$, then $|a_{ij}| < 1$ for some $i, j$.

22. For a given positive integer $n$, let $V$ be an $(n + 1)$-dimensional vector space over $\mathbf{R}$ with a basis $B = \{e_1, e_2, \ldots, e_{n+1}\}$. If $T : V \to V$ is the linear transformation such that $T(e_i) = e_{i+1}$ for $i = 1, 2, \ldots, n$ and $T(e_{n+1}) = 0$, then

(a) nullity of $T$ is 1;

(b) rank of $T$ is $n$;

(c) trace of $T$ is non-zero;

(d) $T^n = T \circ T \circ \cdots \circ T$ is the zero map.

23. Let M be a 3 x 3 non-zero matrix with the property M 3 = 0. Which of the 
following statements is (are) false? 

(a) M has one non-zero eigenvector; 

(b) M is similar to a diagonal matrix; 

(c) M is not similar to a diagonal matrix; 

(d) M has three linearly independent eigenvectors. 

24. If $T$ is a linear transformation on the vector space $\mathbf{R}^n$ over $\mathbf{R}$ such that $T^2 = \lambda T$ for some $\lambda \in \mathbf{R}$, then





(a) $T = \lambda I$ where $I$ is the identity transformation on $\mathbf{R}^n$;

(b) if $\|Tx\| = \|x\|$ for some non-zero vector $x \in \mathbf{R}^n$, then $\lambda = \pm 1$;

(c) $\|Tx\| = |\lambda|\,\|x\|$ for all $x \in \mathbf{R}^n$;

(d) if $\|Tx\| > \|x\|$ for some non-zero vector $x \in \mathbf{R}^n$, then $T$ is necessarily singular.

25. Let $M$ be the $3 \times 3$ real matrix all of whose entries are 1. Then:

(a) $M$ is positive semidefinite, i.e., $\langle Mx, x\rangle \ge 0$ for all $x \in \mathbf{R}^3$;

(b) $M$ is diagonalizable;

(c) 0 and 3 are the only eigenvalues of $M$;

(d) $M$ is positive definite, i.e., $\langle Mx, x\rangle > 0$ for all $x \in \mathbf{R}^3$ with $x \ne 0$.

26. Let $T : \mathbf{R}^7 \to \mathbf{R}^7$ be the linear transformation defined by $T(x_1, x_2, \ldots, x_6, x_7) = (x_7, x_6, \ldots, x_2, x_1)$. Which of the following statements is (are) correct?

(a) There is a basis of R 7 with respect to which T is a diagonal matrix; 

(b) T 1 = /; 

(c) The determinant of T is 1 ; 

(d) The smallest $n$ such that $T^n = I$ is even.

27. Let A be an orthogonal 3x3 matrix with real entries. Which of the following 
statements is (are) false? 

(a) The determinant of A is a rational number; 

(b) All the entries of A are positive; 

(c) $d(Ax, Ay) = d(x, y)$ for any two vectors $x$ and $y \in \mathbb{R}^3$, where $d(u, v)$ denotes
the usual Euclidean distance between vectors $u$ and $v \in \mathbb{R}^3$;

(d) All the eigenvalues of A are real. 

28. Which of the following matrices is (are) non-singular? 

(a) Every symmetric non-zero real $3 \times 3$ matrix;

(b) $I + A$, where $A \neq 0$ is a skew-symmetric real $n \times n$ matrix for $n \geq 2$;

(c) Every skew-symmetric non-zero real $5 \times 5$ matrix;

(d) Every skew-symmetric non-zero real $2 \times 2$ matrix.

29. Let $A$ be a real symmetric $n \times n$ matrix whose only eigenvalues are 0 and 1.
Let the dimension of the null space of $A - I$ be $m$. Which of the following
statements is (are) false?

(a) The characteristic polynomial of $A$ is $(\lambda - 1)^m \lambda^{m-n}$;

(b) The rank of $A$ is $n - m$;

(c) $A^k = A^{k+1}$ for all positive integers $k$;

(d) The rank of $A$ is $m$.

30. Let $\omega$ be a complex number such that $\omega^3 = 1$, but $\omega \neq 1$. Suppose

$$
M = \begin{pmatrix} 1 & \omega & \omega^2 \\ \omega & \omega^2 & 1 \\ \omega^2 & \omega & 1 \end{pmatrix}.
$$

Which of the following statements is (are) correct?

(a) 0 is an eigenvalue of $M$;

(b) $\operatorname{rank}(M) = 2$;

(c) $M$ is invertible;

(d) There exist linearly independent vectors $v, w \in \mathbb{C}^3$ such that $Mv =
Mw = 0$.

31. $M = (a_{ij})_{n \times n}$ is a square matrix with integer entries such that $a_{ij} = 0$ for $i > j$
and $a_{ii} = 1$ for $i = 1, \ldots, n$. Which of the following statements is (are) correct?

(a) $M^{-1}$ exists and it has some entries that are not integers;

(b) $M^{-1}$ exists and it has integer entries;

(c) $M^{-1}$ is a polynomial function of $M$ with integer coefficients;

(d) $M^{-1}$ is not a power of $M$ unless $M$ is the identity matrix.

32. Let $M$ be a $4 \times 4$ real matrix such that $-1, 1, 2, -2$ are its eigenvalues. Suppose
$B = M^4 - 5M^2 + 5I$, where $I$ denotes the $4 \times 4$ identity matrix. Which of the
following statements is (are) correct?

(a) trace of $(M - B)$ is 0;

(b) $\det(B) = 1$;

(c) $\det(M + B) = 0$;

(d) trace of $(M + B)$ is 4.

33. Let $M$ be a $2 \times 2$ non-zero complex matrix such that $M^2 = 0$. Which of the
following statements is (are) correct?

(a) $PMP^{-1}$ is a diagonal matrix for some invertible $2 \times 2$ matrix $P$ with entries
in $\mathbb{R}$;

(b) $M$ has only one eigenvalue in $\mathbb{C}$ with multiplicity 2;

(c) $M$ has two distinct eigenvalues in $\mathbb{C}$;

(d) $Mv = v$ for some $v \in \mathbb{C}^2$, $v \neq 0$.

34. Let $M_2(\mathbb{R})$ denote the set of $2 \times 2$ real matrices. Let $A \in M_2(\mathbb{R})$ be of trace 2
and determinant $-3$. Identifying $M_2(\mathbb{R})$ with $\mathbb{R}^4$, consider the linear transformation
$T : M_2(\mathbb{R}) \to M_2(\mathbb{R})$ defined by $T(M) = AM$. Which of the following
statements is (are) correct?

(a) $T$ is invertible;

(b) 2 is an eigenvalue of $T$;

(c) $T$ is diagonalizable;

(d) $T(B) = B$ for some $0 \neq B$ in $M_2(\mathbb{R})$.

35. Let $\lambda, \mu$ be two distinct eigenvalues of a $2 \times 2$ matrix $M$. Which of the following
statements is (are) correct?

(a) M 3 = + MV; 

(b) $M^2$ has distinct eigenvalues;

(c) trace of $M^n$ is $\lambda^n + \mu^n$ for every positive integer $n$;

(d) $M^n$ is not a scalar multiple of identity for any positive integer $n$.

36. Let $T$ be a non-zero linear transformation on a real vector space $V$ of dimension
$n$. Let the subspace $V_0 \subseteq V$ be the image of $V$ under $T$. Let $k = \dim V_0 < n$
and suppose that for some $\lambda \in \mathbb{R}$, $T^2 = \lambda T$. Then

(a) $\lambda = 1$;

(b) $\lambda$ is the only eigenvalue of $T$;

(c) $\det M = |\lambda|$, where $M$ is a matrix representation of $T$;

(d) there is a non-trivial subspace $V_1 \subseteq V$ such that $Tx = 0$ for all $x \in V_1$.

37. Let $M$ be an $n \times n$ real matrix and $V$ be the vector space spanned by $\{I, M,
M^2, \ldots, M^{2n}\}$. Then the dimension of the vector space $V$ is

(a) at most $n$ (b) $n^2$ (c) $2n$ (d) at most $2n$.

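38. Let $A$ and $B$ be two real square matrices of order $n$. Then

(a) $\operatorname{trace}(A + B) < \operatorname{trace}(A) + \operatorname{trace}(B)$;

(b) $\operatorname{trace}(A + B) > \operatorname{trace}(A) + \operatorname{trace}(B)$;

(c) $\operatorname{trace}(A + B) = \operatorname{trace}(A) + \operatorname{trace}(B)$;

(d) $\operatorname{trace}(A + B) \neq \operatorname{trace}(A) + \operatorname{trace}(B)$.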

39. Let $A$ and $B$ be two real square matrices of order $n$. Then

(a) $\operatorname{trace}(AA^t)$ = sum of all entries of $A$;

(b) $\operatorname{trace}(AA^t)$ = sum of all diagonal entries of $A$;

(c) $\operatorname{trace}(AB) = \operatorname{trace}(BA)$;

(d) $\operatorname{trace}(AA^t) = 0$ does not imply that $A$ is a null matrix.

Exercises B (True/False Statements) Determine the correct statement from the 
following list: 

1. Cardinalities of all bases of a vector space are the same. 

2. If V is a finite-dimensional vector space and U is a subspace of V with dim V = 
dim U, then U may not be same as V. 

3. If $v$ is a non-zero vector in $\mathbb{R}^2$, then the subspace $V = \{rv : r \in \mathbb{R}\}$ represents
geometrically a straight line passing through the origin in $\mathbb{R}^2$.

4. Let $M_2(\mathbb{R})$ be the vector space of all $2 \times 2$ matrices over $\mathbb{R}$ and $S$ be the set of
all symmetric matrices in $M_2(\mathbb{R})$. Then $\dim S = 4$.

5. The map $T : \mathbb{R}^2 \to \mathbb{R}^1$ defined by $T(x, y) = \sin(x + y)$ is a linear transformation.

6. Let $V$ be an $n$-dimensional vector space and $T : V \to U$ be a linear transformation.
Then $T$ is injective iff rank of $T = n$.

7. The real vector space C([0, 1]) is a commutative real algebra with identity. 

8. Let A be an algebra with identity 1 over a field F . Then A is isomorphic to a 
subalgebra of the algebra L(V,V ) of all linear transformations from V to V. 

9. Let A be an n x n matrix over the field F . Then the row rank of A is equal to 
the column rank of A. 

10. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation of the plane through an angle $\theta$, $0 < \theta < \pi$.
Then $T$ has no eigenvector.

11. The matrix


is diagonalizable over C but not over R. 

12. The matrix 


is similar to the Jordan matrix 



13. The orthogonal complement of the subspace V generated by the vector v = 
(2, 3,4) in R 3 is the plane 2x + 3y + 4z = 0 in R 3 passing through the origin 
and perpendicular to the vector v. 

14. If $U = \{(x, y, 0) \in \mathbb{R}^3\}$, then for any $v = (r, t, s) \in \mathbb{R}^3$, the coset $v + U$ in $\mathbb{R}^3$
represents geometrically the plane parallel to the $xy$-plane passing through the point
$(r, t, s)$ and at a distance $|s|$ from the $xy$-plane.

15. For the quadratic form $q(x) = x_1^2 - 6x_1x_2 + x_2^2$, rank is 2 and signature is 1.

16. The matrix $A$ = ( * ^ ) is positive definite but the matrix $B$ = ( 1 * ) is not positive
definite.

17. A symmetric matrix $M$ is positive definite iff there exists a nonsingular matrix
$P$ such that $M = P^tP$.

18. Let $V$ be an $n$-dimensional inner product space over $\mathbb{R}$. If $T : V \to V$ is positive
definite, then all eigenvalues of $T$ are not necessarily positive.


8.14 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004, 2007; Artin 
1991; Birkhoff and Mac Lane 2003; Chatterjee 1966; Hazewinkel et al. 2011; Her- 
stein 1964; Hoffman and Kunze 1971; Hungerford 1974; Janich 1994; Lang 1986; 
Simmons 1963), for further details.


Chapter 9

Modules 


The concept of a module arose through the study of algebraic number theory. It
became an important tool in algebra in the late 1920s, essentially due to the insight of
E. Noether, who was the first mathematician to realize its importance. A module is
an additive abelian group whose elements can be multiplied by the elements
of some ring in a compatible way. Modules exist over any ring. Module theory is one
of the most important topics in modern algebra. A module over a ring is a generalization of an
abelian group (which is a module over $\mathbb{Z}$) and also a natural generalization of a vector
space (which is a module over a division ring, in particular over a field). Many results on vector
spaces generalize to special classes of modules, such as free modules and
finitely generated modules over a PID. Modules are closely related to the representation
theory of groups. They play a central role in commutative algebra, and module theory
is one of the basic concepts that accelerated its development. Modules are also widely used in the structure theory of finitely
generated abelian groups, finite abelian groups and PIDs, in homological algebra, and in
algebraic topology. In this chapter we study the basic properties of modules. We also
consider modules of special classes, such as free modules, modules over a PID along
with structure theorems, exact sequences of modules and their homomorphisms,
Noetherian and Artinian modules, and homology and cohomology modules. Our study
culminates in a discussion of the topology of the spectrum of modules and rings, with
special reference to the Zariski topology and schemes. In this chapter we also show how
to construct isomorphic replicas of all homomorphic images of a specified abstract
module.


9.1 Introductory Concepts 

The notion of a module is a generalization of the concept of a vector space, in which
the scalars are allowed to lie in an arbitrary ring (instead of a field, as for a vector
space). So, naturally, modules and vector spaces have some common properties, but
they differ in many others.

Definition 9.1.1 Let $R$ be a ring. A (left) $R$-module is an additive abelian group $M$
together with an action (called scalar multiplication) $\mu : R \times M \to M$ (the image
$\mu(r, x)$ being denoted by $rx$) such that $\forall r, s \in R$ and $x, y \in M$:

(i) $r(x + y) = rx + ry$;

(ii) $(r + s)x = rx + sx$;

(iii) $r(sx) = (rs)x$.

Moreover, if $R$ has an identity element $1$ and

(iv) $1x = x$ $\forall x \in M$,

then $M$ is said to be a unitary (left) $R$-module.

If $R$ is a division ring, then a unitary (left) $R$-module is called a (left) vector
space over $R$.

If $R = \mathbb{Z}$, then an $R$-module is merely an additive abelian group.

A ring $R$ itself can be considered as an $R$-module by taking scalar multiplication
to be the usual multiplication of $R$.
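
When $R = \mathbb{Z}$, the scalar multiple $nx$ is forced to be repeated addition of $x$ (or of $-x$ when $n < 0$). The following Python sketch is illustrative only: it takes the group $\mathbb{Z}_6$ for concreteness, and the helper names `scalar` and `check_module_axioms` are hypothetical. It tests axioms (i)-(iv) of Definition 9.1.1 on a finite sample of integer scalars, which is a sanity check rather than a proof.

```python
def scalar(n: int, x: int, m: int) -> int:
    """Return n.x in the additive group Z_m, built by repeated addition."""
    step = x % m if n > 0 else (-x) % m   # add x for n > 0, add -x for n < 0
    total = 0                             # n = 0 gives the zero element
    for _ in range(abs(n)):
        total = (total + step) % m
    return total


def check_module_axioms(m: int, scalars=range(-4, 5)) -> bool:
    """Check axioms (i)-(iv) for Z acting on Z_m on a sample of scalars."""
    for r in scalars:
        for s in scalars:
            for x in range(m):
                for y in range(m):
                    # (i) r(x + y) = rx + ry
                    if scalar(r, (x + y) % m, m) != (scalar(r, x, m) + scalar(r, y, m)) % m:
                        return False
                # (ii) (r + s)x = rx + sx
                if scalar(r + s, x, m) != (scalar(r, x, m) + scalar(s, x, m)) % m:
                    return False
                # (iii) r(sx) = (rs)x
                if scalar(r, scalar(s, x, m), m) != scalar(r * s, x, m):
                    return False
    # (iv) 1x = x
    return all(scalar(1, x, m) == x for x in range(m))


print(check_module_axioms(6))
```

The same check can be run for any modulus `m`; it only exercises the sampled scalars, so it illustrates the axioms without establishing them in general.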

Remark If we write $\mu(r, x)$ as $f_r(x)$, then $f_r : M \to M$, $x \mapsto rx$ is a group
homomorphism by (i) for each $r \in R$.

If $\operatorname{End}(M)$ is the endomorphism ring of the additive abelian group $M$, then the
map $\psi : R \to \operatorname{End}(M)$, $r \mapsto f_r$ is a ring homomorphism by (ii) and (iii).

Analogously, a (right) $R$-module or a unitary (right) $R$-module or a (right) vector
space is defined by an action $M \times R \to M$ denoted by $(x, r) \mapsto xr$.

Proposition 9.1.1 Let $M$ be a (left) $R$-module with additive identity $0_M$. Then

(i) $0_R x = r 0_M = 0_M$ $\forall r \in R$ and $x \in M$;

(ii) $(-r)x = -(rx) = r(-x)$ $\forall r \in R$ and $x \in M$,

where $0_R$ is the additive identity of the ring $R$.

Proof Trivial. □

Remark A given additive abelian group $M$ may have different $R$-module structures
(both left and right). If $R$ is commutative, then every left $R$-module $M$ can be given
the structure of a right $R$-module by defining

$xr = rx$ $\forall r \in R$, $x \in M$.

From now on, unless specified otherwise, the ring $R$ will mean a commutative ring
with identity $1$ and $R$-module will mean a unitary (left) $R$-module.

Example 9.1.1 (i) Every additive abelian group $G$ is a unitary $\mathbb{Z}$-module with $nx$
($n \in \mathbb{Z}$, $x \in G$) defined by

$$
nx = \begin{cases}
x + x + \cdots + x & (n \text{ times, if } n > 0) \\
0 & (\text{if } n = 0) \\
-x - x - \cdots - x & (|n| \text{ times, if } n < 0).
\end{cases}
$$

(ii) An ideal $I$ of a ring $R$ is an $R$-module with $ra$ ($r \in R$, $a \in I$) being the usual
product in $R$. In particular, $R$ is itself an $R$-module.

(iii) If $R$ is a division ring $F$, then an $R$-module is called an $F$-vector space.

(iv) If $M$ is a ring and $R$ is a subring of $M$, then $M$ is an $R$-module with $rx$
($r \in R$, $x \in M$) being the usual multiplication in $M$. In particular, $R[[x]]$ and $R[x]$
are $R$-modules.

(v) If $I$ is an ideal of $R$, then $I$ is an additive subgroup of $R$ and $R/I$ is an abelian
group.

Clearly, $R/I$ is an $R$-module with $r(t + I) = rt + I$ $\forall r, t \in R$.

(vi) Let $(M, +)$ be an abelian group and $\operatorname{End}(M)$ be the ring of all endomorphisms
of $(M, +)$. Taking $R = \operatorname{End}(M)$, $M$ becomes an $R$-module under the external
law of composition $\mu : R \times M \to M$, defined by

$\mu(f, x) = f \cdot x = f(x)$, $\forall f \in R$ and $\forall x \in M$.

(vii) If $R$ is a unitary ring and $n$ is a positive integer, then the abelian group
$(R^n, +)$ (under componentwise addition) is a unitary $R$-module under the external
law of composition $\mu : R \times R^n \to R^n$, defined by

$\mu(r, x) = r \cdot x = (rx_1, rx_2, \ldots, rx_n)$, $\forall r \in R$ and $\forall x = (x_1, x_2, \ldots, x_n) \in R^n$.

In particular, if $R = F$ is a field, then $F^n$ is an $F$-vector space.

(viii) Let $R$ be a ring and $(R^{\mathbb{N}^+}, +)$ denote the abelian group of all mappings
$f : \mathbb{N}^+ \to R$ (i.e., of all sequences of elements of $R$) under addition defined by
$(f + g)(n) = f(n) + g(n)$, $\forall n \in \mathbb{N}^+$. Then $R^{\mathbb{N}^+}$ is an $R$-module under the external
law of composition $\mu : R \times R^{\mathbb{N}^+} \to R^{\mathbb{N}^+}$, defined by

$(\mu(r, f))(n) = (r \cdot f)(n) = r f(n)$, $\forall n \in \mathbb{N}^+$.

(ix) Let $P_n(x)$ denote the set of all polynomials of degrees $< n$ with real coefficients;
then $P_n(x)$ becomes a real vector space.

(x) Let $M$ be a smooth manifold. Then $C^\infty(M) = \{f : M \to \mathbb{R}$ such that $f$ is
a smooth function$\}$ forms a ring, and the set $S$ of all smooth vector fields on $M$
forms a module over the ring $C^\infty(M)$.

(xi) Let $V$ be an $n$-dimensional vector space over $F$ and $T : V \to V$ be a fixed
linear operator. Then $V$ is an $F[x]$-module under the external law of composition

$F[x] \times V \to V$, $(f, v) \mapsto f(T)v$.
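
The $F[x]$-module structure of Example 9.1.1(xi) can be made concrete with a small computation. The sketch below is only an illustration under stated assumptions: $F = \mathbb{Q}$ (via Python's `Fraction`), $V = \mathbb{Q}^2$, a sample matrix for $T$, and hypothetical helper names `mat_vec`, `act` and `poly_mul`. A polynomial is stored as its coefficient list, lowest degree first, and `act` evaluates $f(T)v$ by Horner's scheme.

```python
from fractions import Fraction


def mat_vec(T, v):
    """Apply the matrix T (a list of rows) to the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def act(f, T, v):
    """Return f(T)v for f = f[0] + f[1]*x + ..., using Horner's scheme."""
    result = [Fraction(0)] * len(v)
    for c in reversed(f):
        # result <- T(result) + c*v
        result = [tw + c * x for tw, x in zip(mat_vec(T, result), v)]
    return result


def poly_mul(f, g):
    """Product of two non-empty coefficient lists."""
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h


T = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
v = [Fraction(1), Fraction(-1)]
f = [Fraction(1), Fraction(2)]                 # f(x) = 1 + 2x
g = [Fraction(0), Fraction(1), Fraction(1)]    # g(x) = x + x^2
f_plus_g = [Fraction(1), Fraction(3), Fraction(1)]

# axiom (iii) of Definition 9.1.1: (fg)v = f(gv)
print(act(poly_mul(f, g), T, v) == act(f, T, act(g, T, v)))
# axiom (ii): (f + g)v = fv + gv
print(act(f_plus_g, T, v) == [a + b for a, b in zip(act(f, T, v), act(g, T, v))])
```

Different choices of $T$ give different $F[x]$-module structures on the same vector space $V$, which is why $T$ is fixed in the example.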

We recall the definition of an algebra. 

Definition 9.1.2 An algebra consists of a vector space $V$ over a field $F$ together
with a binary operation of multiplication on the set $V$ of vectors, such that $\forall r \in F$
and $x, y, z \in V$ the following conditions are satisfied:

(i) $(rx)y = r(xy) = x(ry)$;

(ii) $(x + y)z = xz + yz$;

(iii) $x(y + z) = xy + xz$.

If, moreover, (iv) $(xy)z = x(yz)$, $\forall x, y, z \in V$, then $V$ is said to be an associative
algebra over $F$.

If $\dim_F(V) = n$ as a vector space over $F$, then the algebra $V$ is said to be
$n$-dimensional.

Example 9.1.2 (i) Let $G$ be a field extension of a field $F$. Then $F$ is a subfield
of $G$ and $V = G$ is an associative algebra over $F$, where addition and multiplication
of elements of $V$ are the usual field addition and multiplication in $G$, and scalar
multiplication by elements of $F$ is again the usual field multiplication in $G$.

(ii) Let $V$ be a vector space over a field $F$. Then the set $L(V, V)$ of all linear
transformations of $V$ into itself is an algebra over $F$.

Definition 9.1.3 For a commutative ring $R$ with identity, an $R$-algebra (or algebra
over $R$) is a ring $K$ such that

(i) $(K, +)$ is a unitary (left) $R$-module;

(ii) $r(xy) = (rx)y = x(ry)$, $\forall r \in R$, $x, y \in K$.

Example 9.1.3 (i) Every ring $R$ is a $\mathbb{Z}$-algebra.

(ii) Let $R$ be a commutative ring with 1. Then

(a) the group ring $R(G)$ is an $R$-algebra with $R$-module structure $r(\sum a_i x_i) =
\sum (r a_i) x_i$, $\forall r, a_i \in R$ and $x_i \in G$ (see Ex. 10 of Exercises-I, Chap. 4).

(b) $M_n(R)$, the ring of all square matrices of order $n$ over $R$, is an $R$-algebra.


9.2 Submodules 

We are interested in showing how to construct isomorphic replicas of all homomorphic
images of a specified abstract module. For this purpose we introduce the concept
of submodules, which play an important role in determining both the structure of a
module $M$ and the nature of homomorphisms with domain $M$.

Let $M$ be an $R$-module. Then $(M, +)$ is an abelian group. We now consider its
subgroups $(N, +)$ which are stable under the external law of composition defined
on the $R$-module $M$.

Definition 9.2.1 Let $M$ be an $R$-module. A non-empty subset $N$ of $M$ is called an
$R$-submodule or (simply) a submodule of $M$ iff

(i) for $x, y \in N$, $x - y \in N$ (i.e., $(N, +)$ is a subgroup of $(M, +)$);

(ii) for $x \in N$, $r \in R$, $rx \in N$ (i.e., $N$ is stable under the external law of composition
on $M$).

Thus a submodule $N$ of $M$ is closed under the restrictions to $N$ of both addition
and scalar multiplication (defined on $M$). Clearly, $\{0_M\}$ and $M$ are
submodules of $M$ in their own right, which are called trivial submodules of $M$.

Remark $\{0_M\}$ is the smallest submodule of $M$ and $M$ is the largest submodule
of $M$. If there exist any other submodules of $M$, then they are called non-trivial
submodules of $M$. Any submodule of $M$ other than $M$ is called a proper submodule
of $M$.

Theorem 9.2.1 Let $M$ be a unitary $R$-module. Then a non-empty subset $N$ of $M$ is
a submodule of $M$ iff for all $a, b \in R$ and $x, y \in N$, $ax + by \in N$.

Proof Let $N$ be a submodule of the unitary $R$-module $M$. Then for $a, b \in R$ and
$x, y \in N$, $ax \in N$ and $-(by) \in N$, and hence $ax + by = ax - (-(by)) \in N$.

Conversely, let $N$ be a non-empty subset of $M$ such that for all $a, b \in R$ and $x,
y \in N$, $ax + by \in N$. Then $1_R$ and $-1_R \in R$ $\Rightarrow$ $1_R x + (-1_R)y \in N$ $\Rightarrow$ $x - y \in N$ $\Rightarrow$
$(N, +)$ is a subgroup of $(M, +)$. Again $0_R \in R$ $\Rightarrow$ $ax + 0_R y = ax \in N$ $\Rightarrow$ $N$ is
stable under the external law of composition. Consequently, $N$ is a submodule of $M$. □

Corollary Let $M$ be a unitary $R$-module. Then a non-empty subset $N$ of $M$ is a
submodule of $M$ iff

(i) $x, y \in N \Rightarrow x + y \in N$ and

(ii) $a \in R$, $x \in N \Rightarrow ax \in N$.

Example 9.2.1 (i) If $(G, +)$ is an abelian group, then $G$ is a $\mathbb{Z}$-module by
Example 9.1.1(i). The submodules of the $\mathbb{Z}$-module $G$ are precisely the subgroups of $G$.
In particular, $E = \{0, \pm 2, \pm 4, \ldots\}$ is a submodule of the $\mathbb{Z}$-module $\mathbb{Z}$.

(ii) Every ring $R$ may be considered as a left $R$-module by Example 9.1.1(ii).
Let $I$ be a submodule of $R$. Then $I \subseteq R$ is such that $x - y \in I$ and $rx \in I$, $\forall x, y \in I$
and $\forall r \in R$ by Definition 9.2.1. Consequently, $I$ is a left ideal of $R$. Conversely,
let $I$ be a left ideal of $R$. Then $I \subseteq R$ is such that $x - y \in I$ and $rx \in I$, $\forall x, y \in I$
and $\forall r \in R$. Thus $I$ is a submodule of the left $R$-module $R$. Consequently, the submodules
of the left $R$-module $R$ are precisely the left ideals of $R$.

Likewise, considering $R$ as a right $R$-module, the submodules of $R$ are precisely
the right ideals of $R$.

(iii) Submodules of a commutative ring are precisely its ideals.

Most of the operations considered for groups have their counterparts for modules. 

Theorem 9.2.2 Let $M$ be an $R$-module and $\{M_i\}_{i \in I}$ be a family of submodules
of $M$. Then their intersection $\bigcap_{i \in I} M_i$ is again a submodule of $M$.

Proof Let $N = \bigcap_{i \in I} M_i$. Now $0_M \in M_i$ for each $i \in I$, since every submodule $M_i$
is a subgroup of $M$ $\Rightarrow$ $0_M \in N$ $\Rightarrow$ $N \neq \emptyset$.

Again, for all $x, y \in N$ and $r \in R$, $x - y \in M_i$ and $rx \in M_i$ for each $i \in I$ $\Rightarrow$ $x - y$,
$rx \in N$ $\Rightarrow$ $N$ is a submodule of $M$. □

Remark The union of two submodules of an $R$-module $M$ is not in general a submodule
of $M$. The reason is that the union of two subgroups of a group is not in general
a group. We now cite the following examples.

Example 9.2.2 (i) Let $M_1 = \{0, \pm 2, \pm 4, \pm 6, \ldots\}$ and $M_2 = \{0, \pm 3, \pm 6, \pm 9, \ldots\}$.
Then $M_1$ and $M_2$ are both submodules of the $\mathbb{Z}$-module $\mathbb{Z}$. Now $3 \in M_1 \cup M_2$ and
$2 \in M_1 \cup M_2$ but $3 + 2 = 5 \notin M_1 \cup M_2$. Hence $M_1 \cup M_2$ cannot be a submodule of
the $\mathbb{Z}$-module $\mathbb{Z}$.

(ii) Let $M_1 = \{(x, 0) \in \mathbb{R}^2\}$ and $M_2 = \{(0, y) \in \mathbb{R}^2\}$. Then $(1, 0) \in M_1$ and
$(0, 1) \in M_2$ but $(1, 0) + (0, 1) = (1, 1) \notin M_1 \cup M_2$ $\Rightarrow$ $M_1 \cup M_2$ is not a subgroup of
$\mathbb{R}^2$ $\Rightarrow$ $M_1 \cup M_2$ cannot be a submodule of the $\mathbb{Z}$-module $\mathbb{R}^2$.
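
The failure in Example 9.2.2(i) is easy to search for by machine. The following Python sketch is illustrative only; the helper names `in_union` and `closure_failures` and the search bound are assumptions made here. It lists pairs in $2\mathbb{Z} \cup 3\mathbb{Z}$ whose sum leaves the union, and confirms that the union is nevertheless stable under scalar multiplication, so additive closure is the only condition that fails.

```python
def in_union(k: int) -> bool:
    """Membership in M1 ∪ M2, where M1 = 2Z and M2 = 3Z."""
    return k % 2 == 0 or k % 3 == 0


def closure_failures(bound: int):
    """Pairs (a, b) in the union with |a|, |b| <= bound whose sum leaves the union."""
    members = [k for k in range(-bound, bound + 1) if in_union(k)]
    return [(a, b) for a in members for b in members if not in_union(a + b)]


failures = closure_failures(6)
print((3, 2) in failures)        # the witness used in the text: 3 + 2 = 5

# scalar multiples never leave the union, so only additive closure fails
print(all(in_union(r * k)
          for r in range(-6, 7)
          for k in range(-6, 7) if in_union(k)))
```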

Let $M$ be an $R$-module. If $M_1$ and $M_2$ are submodules of $M$, then $M_1 \cap M_2$ is
the largest submodule of $M$ contained in both $M_1$ and $M_2$.

Theorem 9.2.3 determines the smallest submodule containing a given
subset $S$ (including the possibility $S = \emptyset$) of an $R$-module $M$.

Definition 9.2.2 Let $M$ be an $R$-module and $S$ be a subset of $M$. Then the submodule
generated or spanned by $S$, denoted by $\langle S \rangle$, is defined to be the smallest submodule
of $M$ containing $S$, i.e., $\langle S \rangle$ is the submodule of $M$ obtained by the intersection
of all submodules $M_i$ of $M$ containing $S$.

To determine the elements of $\langle S \rangle$, we introduce the concept of linear combinations
of elements of $S$ as defined in vector spaces.

Definition 9.2.3 Let $M$ be an $R$-module and $S \neq \emptyset$ be a subset of $M$. Then an
element $x \in M$ is said to be a linear combination of elements of $S$ iff $\exists x_1, x_2, \ldots,
x_n \in S$ and $r_1, r_2, \ldots, r_n \in R$ such that

$$
x = \sum_{i=1}^{n} r_i x_i. \tag{9.1}
$$

Let $C(S)$ denote the set of all linear combinations of elements of $S$ in the form (9.1).

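Theorem 9.2.3 Let $M$ be a unitary $R$-module and $S$ be a subset of $M$. Then the
submodule $\langle S \rangle$ generated by $S$ is given by

$$
\langle S \rangle = \begin{cases} \{0_M\} & \text{if } S = \emptyset \\ C(S) & \text{if } S \neq \emptyset, \end{cases}
$$

where

$$
C(S) = \Big\{ \sum_{\text{finite sum}} r_i x_i : r_i \in R,\ x_i \in S \Big\}.
$$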

Proof If $S = \emptyset$, then the smallest submodule of $M$ containing $S$ is the zero submodule
$\{0_M\}$. Hence $\langle S \rangle = \{0_M\}$ if $S = \emptyset$. Next suppose that $S \neq \emptyset$. Then $C(S) \neq \emptyset$,
since $1_R \in R$ and for $x \in S$, $x = 1_R x \in C(S)$. Hence $S \subseteq C(S)$. We claim that $C(S)$
is a submodule of $M$. Let $x, y \in C(S)$. Then $x = \sum_{i=1}^{n} r_i x_i$ and $y = \sum_{j=1}^{m} t_j y_j$ for
some $r_i, t_j \in R$ and $x_i, y_j \in S$. Hence $x + y = \sum_{i=1}^{n} r_i x_i + \sum_{j=1}^{m} t_j y_j \in C(S)$ and
$rx = r(r_1 x_1 + r_2 x_2 + \cdots + r_n x_n) = (rr_1)x_1 + (rr_2)x_2 + \cdots + (rr_n)x_n \in C(S)$ for every $r \in R$ $\Rightarrow$
$C(S)$ is a submodule of $M$ containing $S$. Finally, let $N$ be a submodule of $M$ such
that $S \subseteq N$. If $x \in C(S)$, then $x$ can be represented as $x = \sum_{i=1}^{n} r_i x_i$, $r_i \in R$, $x_i \in S$.
Now $x_i \in S$ $\Rightarrow$ $x_i \in N$ $\Rightarrow$ $r_i x_i \in N$, $\forall i$ $\Rightarrow$ $\sum_{i=1}^{n} r_i x_i \in N$, since $N$ is a submodule
of $M$ $\Rightarrow$ $x \in N$ $\forall x \in C(S)$ $\Rightarrow$ $C(S) \subseteq N$ $\Rightarrow$ $C(S)$ is the smallest submodule of $M$
containing $S$. Again $\langle S \rangle$ is also the smallest submodule of $M$ containing $S$. Consequently,
$\langle S \rangle = C(S)$, if $S \neq \emptyset$. □
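
For a finite module the set $C(S)$ of Theorem 9.2.3 can be enumerated directly. The sketch below is a minimal illustration, assuming the $\mathbb{Z}$-module $\mathbb{Z}_{12}$ and a hypothetical helper `generated`; it builds all integer linear combinations of the elements of $S$ by repeatedly adding and subtracting generators until nothing new appears.

```python
def generated(S, m):
    """Submodule of the Z-module Z_m generated by S, i.e. the set C(S)."""
    span = {0}                 # the empty set generates the zero submodule
    frontier = [0]
    while frontier:
        x = frontier.pop()
        for s in S:
            # adding or subtracting a generator stays inside C(S)
            for y in ((x + s) % m, (x - s) % m):
                if y not in span:
                    span.add(y)
                    frontier.append(y)
    return span


print(sorted(generated({4, 6}, 12)))   # [0, 2, 4, 6, 8, 10]
print(sorted(generated(set(), 12)))    # [0]
print(generated({5}, 12) == set(range(12)))
```

In this example the submodule generated by $\{4, 6\}$ consists of the even residues, in agreement with the fact that $\gcd(4, 6, 12) = 2$.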

Remark If the ring $R$ does not contain the identity element, then

$$
\langle S \rangle = \Big\{ \sum_{\text{finite sum}} (n_i + r_i) x_i \Big\} \quad \text{if } S \neq \emptyset, \text{ where } n_i \in \mathbb{Z},\ r_i \in R \text{ and } x_i \in S.
$$

Definition 9.2.4 An $R$-module $M$ is said to be generated by a subset $S$ iff $M = \langle S \rangle$,
and $S$ is said to be a set of generators of $M$.

In particular, if $S = \{x_1, x_2, \ldots, x_n\}$ is a finite subset of $M$ such that $M = \langle S \rangle$,
then $M$ is said to be finitely generated by $S$ and hence if $M$ is a unitary $R$-module,
then

$$
M = \Big\{ \sum_{i=1}^{n} r_i x_i : r_i \in R \Big\}.
$$

If $S = \{x\}$, then $\langle \{x\} \rangle = \langle x \rangle$, the submodule generated by $\{x\}$, is given by

$$
\langle x \rangle = \begin{cases} \{rx : r \in R\} & \text{if } M \text{ is a unitary } R\text{-module} \\ \{rx + nx : r \in R,\ n \in \mathbb{Z}\} & \text{otherwise.} \end{cases}
$$

$\langle x \rangle$ is called the cyclic submodule of $M$ generated by $x \in M$.

Thus a unitary $R$-module $M$ is finitely generated iff $M = M_1 + M_2 + \cdots + M_n$,
where each $M_i$ is cyclic, i.e., $M_i = Rx_i$, $i = 1, 2, \ldots, n$, and $M_1 + M_2 + \cdots + M_n$ is
the sum of submodules of $M$ (see Definition 9.2.10). Then $\{x_1, x_2, \ldots, x_n\}$ is a set
of generators for $M$.

Definition 9.2.5 Let $M$ be an $R$-module. A subset $S$ of $M$ is said to be linearly
dependent over $R$ iff there exist distinct elements $x_1, x_2, \ldots, x_n \in S$ and elements
$r_1, r_2, \ldots, r_n$ (not all zero) in $R$ such that $r_1 x_1 + r_2 x_2 + \cdots + r_n x_n = 0_M$. Otherwise,
$S$ is said to be linearly independent over $R$.

Remark (i) $\{0_M\}$ and any subset $S$ (of $M$) containing $0_M$ are linearly dependent
over $R$.

(ii) If $S$ is linearly dependent over $R$ and $T$ is any subset of $M$ such that $S \subseteq T$,
then $T$ is also linearly dependent over $R$, i.e., any subset containing a linearly dependent
set is also linearly dependent.

(iii) If $S$ is linearly independent over $R$ and $T$ is any subset of $M$ such that $T \subseteq S$,
then $T$ is also linearly independent over $R$, i.e., any subset contained in a linearly
independent set is also linearly independent.

Definition 9.2.6 Let $M$ be an $R$-module. A submodule $N$ ($\neq M$) of $M$ is said to
be maximal iff for a submodule $P$ of $M$ such that $N \subseteq P \subseteq M$, either $P = N$ or
$P = M$, i.e., there is no submodule $P$ of $M$ satisfying $N \subsetneq P \subsetneq M$.

Definition 9.2.7 A submodule $N$ ($\neq \{0_M\}$) of $M$ is said to be minimal iff for a
submodule $P$ of $M$ such that $P \subseteq N$, either $P = \{0_M\}$ or $P = N$, i.e., the only
submodules of $M$ contained in $N$ are $\{0_M\}$ and $N$.

Definition 9.2.8 A module $M$ ($\neq \{0_M\}$) is said to be simple iff the only submodules
of $M$ are $\{0_M\}$ and $M$.

Theorem 9.2.4 Let $M$ be a unitary $R$-module. Then $M$ is simple iff for every non-zero
element $x \in M$, $M = Rx = \{rx : r \in R\}$, i.e., iff $M$ is generated by $\{x\}$ for every
$x \neq 0_M$ in $M$.

Proof Let $M$ be a simple unitary $R$-module. Then for $x$ ($\neq 0_M$) in $M$, $x = 1_R x \in
Rx$ $\Rightarrow$ $Rx \neq \emptyset$. Next let $rx, tx \in Rx$, where $r, t \in R$. Then $rx + tx = (r + t)x \in
Rx$ and $r(tx) = (rt)x \in Rx$. Consequently, $Rx$ is a submodule of $M$. Since for $x \neq
0_M$, $x = 1_R x \in Rx$, $Rx \neq \{0_M\}$. Again, $M$ being a simple $R$-module, $Rx = M$.
Conversely, let $Rx = M$ for every non-zero $x \in M$. Suppose $N \neq \{0_M\}$ is a submodule
of $M$. Then $\exists$ a non-zero element $x$ in $N$ such that $Rx \subseteq N$, i.e., $M \subseteq N$.
Since $N \subseteq M$, it follows that $N = M$. Consequently, $M$ is a simple $R$-module. □

Corollary 1 If $R$ is a unitary ring, then $R$ is a simple $R$-module iff $R$ is a division
ring.

Proof Let $R$ be a unitary ring. Then $R$ is a unitary module over itself. Hence by
Theorem 9.2.4, $R$ is a simple $R$-module iff $R = Rx$ for every non-zero $x \in R$.
Again $1_R \in R = Rx$ $\Rightarrow$ $1_R = yx$ for some $y \in R$ $\Rightarrow$ $x$ has a left inverse in $R$. Thus
it follows that every non-zero $x$ in $R$ has an inverse in $R$ (see Proposition 2.3.2 of
Chap. 2). Consequently, $R$ is a division ring. Conversely, let $R$ be a division ring.
Then $\{0_R\}$ and $R$ are the only ideals of $R$ and these are the only submodules of the
$R$-module $R$. Hence $R$ is simple. □
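
Corollary 1 can be tested on the rings $\mathbb{Z}_m$, which are fields (hence division rings) exactly when $m$ is prime. The sketch below is illustrative only, with hypothetical helper names `cyclic` and `is_simple`; it applies the criterion of Theorem 9.2.4, namely that every non-zero $x$ must satisfy $Rx = M$, to $\mathbb{Z}_m$ regarded as a module over itself.

```python
def cyclic(x, m):
    """The cyclic submodule Rx of R = Z_m regarded as a module over itself."""
    return {(r * x) % m for r in range(m)}


def is_simple(m):
    """Criterion of Theorem 9.2.4: every non-zero x generates the whole module."""
    return m > 1 and all(cyclic(x, m) == set(range(m)) for x in range(1, m))


print([m for m in range(2, 13) if is_simple(m)])   # [2, 3, 5, 7, 11]
print(sorted(cyclic(2, 6)))                        # [0, 2, 4], a proper non-zero submodule
```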

Corollary 2 Let $V$ be a vector space over a field $F$. Then for every non-zero $x \in V$,
the subspace $U_x = \{rx : r \in F\}$ is simple. In particular, the $F$-vector space $F$ is
simple.

Theorem 9.2.5 Let $M \neq \{0_M\}$ be a finitely generated $R$-module. Then each proper
submodule of $M$ is contained in a maximal submodule of $M$.

Proof Similar to the proof of Theorem 5.3.6. □

Definition 9.2.9 Let $M$, $N$ be (left) $R$-modules. Then their cartesian product $M \times
N$ is a (left) $R$-module under addition and scalar multiplication defined in the usual
way:

$(x, y) + (s, t) = (x + s, y + t)$ and

$r(x, y) = (rx, ry)$ $\forall (x, y), (s, t) \in M \times N$ and $\forall r \in R$.

The $R$-module $M \times N$ is called the direct product of the $R$-modules $M$ and $N$.

More generally, if $\{M_i\}_{i \in I}$ is any family of (left) $R$-modules, then $M = \prod_{i \in I} M_i$
is a (left) $R$-module in the usual way:

$R \times M \to M$, $(r, (x_i)_{i \in I}) \mapsto (rx_i)_{i \in I}$.

The $R$-module $M = \prod_{i \in I} M_i$ is called the direct product of $\{M_i\}_{i \in I}$.

In particular, $R^2 = R \times R$, $\ldots$, $R^n = R \times R \times \cdots \times R$ ($n$ factors), $R^\infty = \prod_{i=1}^{\infty} R_i$ with each $R_i = R$.


Definition 9.2.10 Let $M$ be an $R$-module and $N_1$, $N_2$ be submodules of $M$. Then
$N_1 + N_2 = \{n_1 + n_2 : n_1 \in N_1, n_2 \in N_2\}$ is called the sum of the submodules $N_1$
and $N_2$.

Proposition 9.2.1 Let $M$ be an $R$-module and $N_1$, $N_2$ submodules of $M$. Then $N_1 +
N_2$ is the submodule generated by $N_1 \cup N_2$.

Proof $0_M \in N_1 + N_2$ $\Rightarrow$ $N_1 + N_2 \neq \emptyset$. Let $x, y \in N_1 + N_2$. Then there exist
$n_1, n_3 \in N_1$ and $n_2, n_4 \in N_2$ such that $x = n_1 + n_2$ and $y = n_3 + n_4$. Now
$x + y = (n_1 + n_2) + (n_3 + n_4) = (n_1 + n_3) + (n_2 + n_4) \in N_1 + N_2$ (by commutativity of $+$ in
$M$) and $rx = rn_1 + rn_2 \in N_1 + N_2$ $\forall x, y \in N_1 + N_2$ and $\forall r \in R$. Hence $N_1 + N_2$ is a
submodule of $M$. We claim that $N_1 \cup N_2 \subseteq N_1 + N_2$ and $N_1 + N_2$ is the smallest submodule
of $M$ containing $N_1 \cup N_2$. Again $n_1 \in N_1$ $\Rightarrow$ $n_1 = n_1 + 0_{N_2} \in N_1 + N_2$ $\Rightarrow$
$N_1 \subseteq N_1 + N_2$. Similarly, $N_2 \subseteq N_1 + N_2$. Consequently, $N_1 \cup N_2 \subseteq N_1 + N_2$.
Let $N$ be the submodule of $M$ generated by $N_1 \cup N_2$ and let $n_1 + n_2 \in N_1 + N_2$,
where $n_1 \in N_1$ and $n_2 \in N_2$. Hence $N_1 \cup N_2 \subseteq N$ $\Rightarrow$ $n_1, n_2 \in N$ $\Rightarrow$ $n_1 + n_2 \in N$ $\Rightarrow$
$N_1 + N_2 \subseteq N$ $\Rightarrow$ $N_1 + N_2$ is the smallest submodule of $M$ containing $N_1 \cup N_2$.
Consequently, $\langle N_1 \cup N_2 \rangle = N_1 + N_2$. □

Corollary If $N_1$ and $N_2$ are submodules of $M$, then $N_1 + N_2$ is the smallest submodule
of $M$ containing both $N_1$ and $N_2$.
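
Proposition 9.2.1 can be observed on a small example. The following Python sketch is an illustration under stated assumptions: the ambient module is the $\mathbb{Z}$-module $\mathbb{Z}_{12}$, and the helpers `generated` and `sub_sum` are hypothetical names introduced here. It compares the sum $N_1 + N_2$ with the submodule generated by $N_1 \cup N_2$.

```python
def generated(S, m):
    """Submodule of the Z-module Z_m generated by S (all integer combinations)."""
    span, frontier = {0}, [0]
    while frontier:
        x = frontier.pop()
        for s in S:
            for y in ((x + s) % m, (x - s) % m):
                if y not in span:
                    span.add(y)
                    frontier.append(y)
    return span


def sub_sum(N1, N2, m):
    """The sum N1 + N2 = {n1 + n2} of Definition 9.2.10, computed in Z_m."""
    return {(a + b) % m for a in N1 for b in N2}


N1 = generated({4}, 12)        # {0, 4, 8}
N2 = generated({6}, 12)        # {0, 6}
print(sorted(sub_sum(N1, N2, 12)))
print(sub_sum(N1, N2, 12) == generated(N1 | N2, 12))
```

The union $N_1 \cup N_2 = \{0, 4, 6, 8\}$ is not itself a submodule here (for instance $4 + 6 = 10$ is missing), while the sum is.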

The concept of the sum $N_1 + N_2$ of two submodules of $M$ can be generalized to
any family $\{N_i\}_{i \in I}$ of submodules of $M$:

$$
\sum_{i \in I} N_i = \Big\{ \sum_{i \in I} x_i : x_i \in N_i,\ x_i = 0_M \text{ except for finitely many } i\text{'s} \Big\}.
$$

This is a submodule of $M$ containing each $N_i$, $i \in I$.

The submodule $\sum_{i \in I} N_i$ is called the sum of the family of submodules $\{N_i\}_{i \in I}$.
In particular, if $I = \{1, 2, \ldots, n\}$, then the submodule $\sum_{i=1}^{n} N_i = N_1 + N_2 + \cdots + N_n$
is called the sum of the $n$ submodules $N_1, N_2, \ldots, N_n$ of $M$.

We now proceed to define the direct sum of two submodules of an $R$-module in two
different but equivalent ways.

Definition 9.2.11 Let $M$ and $N$ be (left) $R$-modules. Then $P = M \times N$ is again a
(left) $R$-module. The direct sum of the modules $M$ and $N$, denoted by $M \oplus N$, is
defined by

$M \oplus N = \{(x, 0_N) + (0_M, y) : x \in M, y \in N\} = \{(x, y) \in P : x \in M, y \in N\}$.

Definition 9.2.12 Let $M$ be an $R$-module and $A$, $B$ be two submodules of $M$. Then
$M$ is said to be the direct sum of $A$ and $B$, denoted by $M = A \oplus B$, iff $M = A + B$ and
$A \cap B = \{0_M\}$. If $M = A \oplus B$, then $A$ and $B$ are called the direct summands of $M$.

Definition 9.2.12 leads to the following proposition. 

Proposition 9.2.2 Let $M$ be an $R$-module and $A$, $B$ be its two submodules. Then
$M = A \oplus B$ iff every element $x \in M$ can be expressed uniquely as $x = a + b$, $a \in A$,
$b \in B$.

Proof $M = A \oplus B$ $\Leftrightarrow$ $M = A + B$ and $A \cap B = \{0_M\}$. Then $x \in M$ can be expressed
as $x = a + b$, $a \in A$, $b \in B$. If $x = a + b = c + d$, where $a, c \in A$ and $b, d \in B$, then
$a - c = d - b$ $\Rightarrow$ $a - c \in A$ and $d - b \in B$ are such that $a - c = d - b \in A \cap B =
\{0_M\}$ $\Rightarrow$ $a = c$ and $d = b$. The converse part is similar. □

Corollary Let $M$ be an $R$-module and $A$, $B$ be two submodules of $M$. Then $A \cap
B = \{0_M\}$ iff every element $x \in A + B$ can be expressed uniquely as $x = a + b$,
where $a \in A$ and $b \in B$.
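
The uniqueness criterion of Proposition 9.2.2 can be checked by brute force in a finite module. The sketch below is illustrative only; it assumes the $\mathbb{Z}$-modules $\mathbb{Z}_6$ and $\mathbb{Z}_{12}$ and the hypothetical helpers `decompositions` and `is_direct_sum`. The first pair of submodules gives a direct sum; the second pair spans the module but has a non-zero intersection, so decompositions are not unique.

```python
def decompositions(x, A, B, m):
    """All pairs (a, b) with a in A, b in B and a + b = x in Z_m."""
    return [(a, b) for a in sorted(A) for b in sorted(B) if (a + b) % m == x]


def is_direct_sum(A, B, m):
    """Definition 9.2.12 for submodules A, B of the Z-module Z_m."""
    total = {(a + b) % m for a in A for b in B}
    return total == set(range(m)) and A & B == {0}


A, B = {0, 3}, {0, 2, 4}                     # submodules of Z_6
print(is_direct_sum(A, B, 6))
print(all(len(decompositions(x, A, B, 6)) == 1 for x in range(6)))

C, D = set(range(0, 12, 2)), {0, 3, 6, 9}    # submodules of Z_12 with C ∩ D = {0, 6}
print(is_direct_sum(C, D, 12))
print(decompositions(0, C, D, 12))           # [(0, 0), (6, 6)]
```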

Definition 9.2.13 An $R$-module $M$ is called the direct sum of a family of submodules
$\{M_i\}_{i \in I}$, denoted by $M = \bigoplus_{i \in I} M_i$, iff $M = \sum_{i \in I} M_i$ and every element $x \in M$
can be expressed uniquely as $x = \sum_{i \in I} x_i$, $x_i \in M_i$, $x_i = 0_M$ except for finitely
many $i$'s. A non-zero module $M$ is said to be semisimple iff it can be decomposed
into a direct sum of a family of minimal submodules of $M$.

Clearly, a local ring is semisimple iff it is a division ring.

Remark 1 In general, $\bigoplus_{i \in I} M_i \subseteq \prod_{i \in I} M_i$. Equality holds iff $I$ is a finite set.

Remark 2 Remark 1 shows that Definition 9.2.11 and Definition 9.2.12 are equivalent.

Definition 9.2.14 A submodule $A$ of an $R$-module $M$ is said to be a direct summand
of $M$ iff there exists a submodule $B$ of $M$ such that $M = A \oplus B$. Such a submodule
$B$ of $M$ is called a supplement or complement of $A$ in $M$.

Remark (i) Every subspace of an inner product space is a direct summand.

(ii) Not every submodule of a module is a direct summand.

(iii) If a submodule of a module $M$ is a direct summand, its supplement in $M$
may not be unique.

(iv) Consider the $\mathbb{Z}$-module $\mathbb{Z}$. A non-zero proper subgroup $\langle n \rangle$ of $\mathbb{Z}$ is not a direct
summand, because a supplement, being a non-zero subgroup of $\mathbb{Z}$ and hence infinite cyclic,
would have to be isomorphic to the finite quotient group $\mathbb{Z}/\langle n \rangle$ ($\cong \mathbb{Z}_n$), which is not possible.

(v) Consider the vector space $\mathbb{R}^2$ over $\mathbb{R}$. Let $l, m, n$ be three distinct lines in $\mathbb{R}^2$
through its origin $(0, 0)$. Then $l, m, n$ are subspaces of $\mathbb{R}^2$ such that both $m$ and $n$
are supplements of $l$ in $\mathbb{R}^2$. Hence $m \neq n$ $\Rightarrow$ the supplement of $l$ is not unique.

Theorem 9.2.6 Let $M$ be an $R$-module and $s(M)$ be the set of all submodules of
$M$. Then $s(M)$ is a modular lattice under set inclusion.

Proof Let $M$ be an $R$-module and $P$, $Q$ be submodules of $M$. Then $P + Q$ is the
smallest submodule of $M$ containing both $P$ and $Q$, i.e., $P \subseteq P + Q$ and $Q \subseteq
P + Q$. Again $P \cap Q$ is the largest submodule of $M$ contained in both $P$ and $Q$,
i.e., $P \cap Q \subseteq P$ and $P \cap Q \subseteq Q$. Thus in the set $s(M)$ of all submodules of $M$,
partially ordered by set inclusion, every pair of elements $P$ and $Q$ has $P + Q$ as lub
(or supremum) and $P \cap Q$ as glb (or infimum), i.e., their join $P \vee Q = P + Q$ and
meet $P \wedge Q = P \cap Q$. Hence $s(M)$ is a lattice under set inclusion. Next we claim
that this lattice is modular. Let $P$, $Q$, $N$ be submodules of $M$ such that $N \subseteq P$.
Then we prove that $P \cap (Q + N) = (P \cap Q) + N$. Now $N \subseteq P$ $\Rightarrow$ $P + N = P$.
Again, $(P \cap Q) + N \subseteq P + N$ and $(P \cap Q) + N \subseteq Q + N$ $\Rightarrow$ $(P \cap Q) + N \subseteq (P +
N) \cap (Q + N) = P \cap (Q + N)$. To prove the reverse inclusion, let $x \in P \cap (Q + N)$.
Then $x \in P$ and $x \in Q + N$. Hence $\exists q \in Q$ and $n \in N$ such that $x = q + n$. Since
$N \subseteq P$, we have $n \in P$ and hence $q = x - n \in P$. Consequently, $q \in P \cap Q$. Hence
$x = q + n \in (P \cap Q) + N$. Thus $P \cap (Q + N) \subseteq (P \cap Q) + N$. Hence it follows
that $P \cap (Q + N) = (P \cap Q) + N$. □

Definition 9.2.15 Let M be a (left) R-module and S be a non-empty subset of M. 
Then the annihilator of S in R, denoted by Ann(S), is defined by 

Ann(S) = {r ∈ R : rx = 0_M, ∀x ∈ S}. 

Proposition 9.2.3 Let M be a (left) R-module and S be a non-empty subset of M. 
Then 

(a) Ann(S) is a left ideal of R; 

(b) if S is a submodule of M, then Ann(S) is a two-sided ideal of R. 

Proof (a) 0_R x = 0_M, ∀x ∈ S ⇒ 0_R ∈ Ann(S) ⇒ Ann(S) ≠ ∅. Let a, b ∈ Ann(S). 
Then a, b ∈ R and ax = 0_M = bx ∀x ∈ S. Hence (a - b)x = 0_M ∀x ∈ S ⇒ a - b ∈ 
Ann(S). Again (ra)x = r(ax) = r0_M = 0_M, ∀x ∈ S, ∀r ∈ R ⇒ ra ∈ Ann(S), 
∀r ∈ R. Consequently, Ann(S) is a left ideal of R. 

(b) Let S be a submodule of M. Then rx ∈ S, ∀r ∈ R and ∀x ∈ S. Hence for 
a ∈ Ann(S), a(rx) = 0_M ⇒ (ar)x = 0_M ∀x ∈ S ⇒ ar ∈ Ann(S) ⇒ Ann(S) is 
also a right ideal of R. Thus Ann(S) is a two-sided ideal of R by (a). □ 

Corollary 1 If M is an R-module, I = Ann(M) (i.e., IM = 0) is a two-sided 
ideal of R. 

Corollary 2 If R is a commutative ring with 1 and I = Ann(M) is a maximal ideal 
of R, then M is a vector space over the field R/I. 

Remark Let I = Ann(M). Then R/I is a ring, since I is a two-sided ideal of R 
by Proposition 9.2.3(b). Define scalar multiplication R/I × M → M by 
(r + I) · x = rx. This is well defined because I annihilates M. 

Then M is both an R-module and an R/I-module, and the two scalar 
multiplications on M coincide. Thus an R-module M is also an R/I-module and vice 
versa. 

Exercise The intersection of all annihilators of all non-zero elements of a simple 
R-module M is a two-sided ideal of R. 

[Hint. Ann(M) is a two-sided ideal of R and the intersection of Ann(x) over all 
x ∈ M with x ≠ 0_M equals Ann(M).] 
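
As a small illustration of Definition 9.2.15, the following sketch computes the annihilator of a subset of the Z-module Z_n. It is only an example under stated assumptions: the function name annihilator_generator and the choice n = 12, S = {4, 6} are hypothetical and not part of the text. Since every ideal of Z is principal, Ann(S) = dZ, and d is the least common multiple of the additive orders of the elements of S.

```python
from math import gcd

def annihilator_generator(n, subset):
    """Return d >= 1 with Ann(S) = dZ, for a subset S of the Z-module Z_n."""
    d = 1
    for x in subset:
        # The additive order of x in Z_n is n / gcd(n, x); r*x = 0 iff it divides r.
        order = n // gcd(n, x % n)
        d = d * order // gcd(d, order)  # running least common multiple
    return d

n, S = 12, [4, 6]
d = annihilator_generator(n, S)

# d annihilates every element of S ...
assert all((d * x) % n == 0 for x in S)
# ... and no smaller positive integer does, so Ann(S) = dZ.
assert not any(all((r * x) % n == 0 for x in S) for r in range(1, d))
print(d)  # 6
```

Here Ann(S) = 6Z is an ideal of Z, in agreement with Proposition 9.2.3(a).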


9.3 Module Homomorphisms 

We continue in this section the study of submodules. 

Structure preserving mappings, called homomorphisms, play an important role 
in group theory as well as in ring theory. It is an essential and basic task to extend 
this concept to module theory. 

Definition 9.3.1 Let M and N be (left) R-modules. A mapping f : M → N is 
called an R-homomorphism (or R-morphism) iff 

(i) f(x + y) = f(x) + f(y); 

(ii) f(rx) = rf(x) ∀x, y ∈ M and ∀r ∈ R. 

In particular, if R is a field, then an R-homomorphism is as usual called a linear 
transformation of vector spaces. Module homomorphisms with special properties 
carry special names, just as special group homomorphisms do. 

Definition 9.3.2 Let f : M → N be an R-homomorphism for R-modules. Then f 
is called an 

(1) R-monomorphism iff f is injective; 

(2) R-epimorphism iff f is surjective; 

(3) R-isomorphism iff f is bijective; 

(4) R-endomorphism iff M = N, i.e., iff f is an R-homomorphism of M into itself; 

(5) R-automorphism iff M = N and f is an R-isomorphism of M onto itself. 

The existence of an R-isomorphism between R-modules M and N asserts that 
M and N are, in a significant sense, the same, i.e., M and N have the same module 
structures. In that case we say that the R-modules M and N are R-isomorphic and 
write M ≅ N. 

For the time being, if we forget the scalar multiplication on M, then f is con- 
sidered a group homomorphism f : (M, +) → (N, +). Hence it preserves additive 
zero and additive inverse. More precisely: 

Proposition 9.3.1 If f : M → N is an R-homomorphism, then f(0_M) = 0_N and 
f(-x) = -f(x), ∀x ∈ M. 

Proof f(0_M) = f(0_M + 0_M) = f(0_M) + f(0_M) ⇒ f(0_M) = 0_N. Again 
f(x) + f(-x) = f(x + (-x)) = f(0_M) = 0_N ⇒ f(-x) = -f(x), ∀x ∈ M. □ 

Definition 9.3.3 Let f : M → N be an R-homomorphism. Then the kernel of f, de- 
noted by ker f, is defined to be the kernel of f as a group homomorphism f : 
(M, +) → (N, +), i.e., ker f = {x ∈ M : f(x) = 0_N}, and the image of f, denoted 
by Im f, is the set Im f = f(M) = {f(x) : x ∈ M}. 

It follows from the definitions that 

Proposition 9.3.2 Let f : M → N be an R-homomorphism. Then 

(a) ker f is a submodule of M; 

(b) f is an R-monomorphism iff ker f = {0_M}; 

(c) for a submodule A of M, f(A) is a submodule of N; 

(d) Im f = f(M) is a submodule of N. 

Remark (i) A necessary and sufficient condition for an R-homomorphism f : M → 
N to be an R-monomorphism is that ker f is as small as possible as a submodule of 
M, namely, ker f = {0_M}. 

(ii) On the other hand, a necessary and sufficient condition for an R-homo- 
morphism f : M → N to be an R-epimorphism is that Im f is as large as possible 
as a submodule of N, namely, Im f = N. 

Example 9.3.1 (i) Every homomorphism of abelian groups is a Z-module homo- 
morphism. 

To show this let M and N be abelian groups. Then M and N are regarded as 
Z-modules. Let f : M → N be a group homomorphism. Then f(nx) = nf(x) by 
induction on n ∈ N+. Since f(0_M) = 0_N and f(-x) = -f(x), it follows that 
f(nx) = nf(x), ∀n ∈ Z. Hence f is a Z-module homomorphism. 

(ii) Let M be an R-module. Then the identity mapping 1_M : M → M is an R- 
automorphism. 

(iii) Let M be an R-module and n be a positive integer. Then the mapping p_t : 
M^n → M, defined by p_t(x_1, x_2, ..., x_t, ..., x_n) = x_t is an R-epimorphism, called 
the t-th projection mapping of M^n onto M; on the other hand, the mapping i_t : M → 
M^n, defined by i_t(x) = (0, 0, ..., x, ..., 0) is an R-monomorphism, called the t-th 
injection mapping, for t = 1, 2, ..., n, where x is at the t-th position. 

(iv) The mapping f : M_n(R) → R, defined by f(A) = trace A, the trace of the 
square matrix A of order n over R, is an R-homomorphism. 

(v) Let S_n(R) = {A ∈ M_n(R) : A is symmetric}. Then the mapping f : M_n(R) → 
S_n(R), defined by f(A) = A + A^t is an R-epimorphism. 

Like group homomorphisms, we can define the composite of R-homomorphisms 
having the following properties: 

Proposition 9.3.3 If f : M → N and g : N → P are R-homomorphisms, then their 
composite mapping g ∘ f : M → P, defined by (g ∘ f)(x) = g(f(x)), ∀x ∈ M is 
also an R-homomorphism satisfying the following properties: 

(a) if f and g are R-epimorphisms, then so is g ∘ f; 

(b) if f and g are R-monomorphisms, then so is g ∘ f; 

(c) if g ∘ f is an R-epimorphism, then so is g; 

(d) if g ∘ f is an R-monomorphism, then so is f. 

Proof For x, y ∈ M and r ∈ R, (g ∘ f)(x + y) = g(f(x + y)) = g(f(x) + f(y)) = 
g(f(x)) + g(f(y)) = (g ∘ f)(x) + (g ∘ f)(y) and (g ∘ f)(rx) = g(f(rx)) = 
g(rf(x)) = rg(f(x)) = r(g ∘ f)(x). Hence g ∘ f : M → P is an R-homomorphism. 
(a)-(d): Left as exercises. □ 

Proposition 9.3.4 Let M and N be R-modules and f : M → N be an R- 
homomorphism. Then for any submodule B of N, the set f^{-1}(B) defined by 

f^{-1}(B) = {x ∈ M : f(x) ∈ B} is a submodule of M. 

Proof f(0_M) = 0_N ∈ B ⇒ 0_M ∈ f^{-1}(B) ⇒ f^{-1}(B) ≠ ∅. 

Now x, y ∈ f^{-1}(B) ⇒ f(x), f(y) ∈ B ⇒ f(x - y) = f(x) - f(y) ∈ B, since 
B is a submodule of N ⇒ x - y ∈ f^{-1}(B). Similarly, for r ∈ R and x ∈ f^{-1}(B), 
rx ∈ f^{-1}(B). Consequently, f^{-1}(B) is a submodule of M. □ 

Proposition 9.3.5 Let f : M → N be an R-homomorphism. 

(a) If A is a submodule of M, then f^{-1}(f(A)) = A + ker f; 

(b) If B is a submodule of N, then f(f^{-1}(B)) = B ∩ Im f. 

Proof (a) x ∈ A ⇒ f(x) ∈ f(A) ⇒ x ∈ f^{-1}(f(A)) ⇒ A ⊆ f^{-1}(f(A)). Again y ∈ 
ker f ⇒ f(y) = 0_N = f(0_M) ∈ f(A) ⇒ y ∈ f^{-1}(f(A)) ⇒ ker f ⊆ f^{-1}(f(A)). 

Since f^{-1}(f(A)) is a submodule of M, it follows that A + ker f ⊆ f^{-1}(f(A)). To 
obtain the reverse inclusion, let x ∈ f^{-1}(f(A)). Then f(x) ∈ f(A) ⇒ f(x) = 
f(a) for some a ∈ A ⇒ f(x - a) = f(x) - f(a) = 0_N ⇒ x - a ∈ ker f ⇒ 
x ∈ a + ker f ⊆ A + ker f ⇒ f^{-1}(f(A)) ⊆ A + ker f. Hence it follows that 
f^{-1}(f(A)) = A + ker f. 

(b) a ∈ f(f^{-1}(B)) ⇒ a = f(x) for some x ∈ f^{-1}(B) ⇒ f(x) ∈ B ⇒ a ∈ B ⇒ 
f(f^{-1}(B)) ⊆ B. Moreover, f(f^{-1}(B)) ⊆ f(M) ⇒ f(f^{-1}(B)) ⊆ Im f. Hence it 
follows that f(f^{-1}(B)) ⊆ B ∩ Im f. For the reverse inclusion, let y ∈ B ∩ Im f. 
Then y ∈ B and y ∈ Im f. Now y ∈ Im f ⇒ y = f(m) for some m ∈ M ⇒ f(m) = 
y ∈ B ⇒ m ∈ f^{-1}(B) ⇒ f(m) ∈ f(f^{-1}(B)) ⇒ y ∈ f(f^{-1}(B)) ⇒ B ∩ Im f ⊆ 
f(f^{-1}(B)). Hence it follows that f(f^{-1}(B)) = B ∩ Im f. □ 

Corollary (a) If f : M → N is an R-monomorphism and A is a submodule of M, 
then f^{-1}(f(A)) = A; 

(b) If f : M → N is an R-epimorphism and B is a submodule of N, then 
f(f^{-1}(B)) = B. 
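
Both identities of Proposition 9.3.5 can be checked by brute force on a small finite module. The sketch below is an illustration only: it assumes the Z-module Z_12 with the endomorphism f(x) = 3x, and the submodules A = {0, 6} and B = 2Z_12 are arbitrary choices not taken from the text.

```python
n = 12
M = range(n)

def f(x):
    # A Z-module endomorphism of Z_12: multiplication by 3.
    return (3 * x) % n

def image(A):
    return {f(a) for a in A}

def preimage(B):
    return {x for x in M if f(x) in B}

ker = preimage({0})          # {0, 4, 8}
im = image(M)                # {0, 3, 6, 9}

A = {0, 6}                   # a submodule of the domain
B = {0, 2, 4, 6, 8, 10}      # a submodule of the codomain

lhs_a = preimage(image(A))                      # f^{-1}(f(A))
rhs_a = {(a + k) % n for a in A for k in ker}   # A + ker f
lhs_b = image(preimage(B))                      # f(f^{-1}(B))
rhs_b = B & im                                  # B ∩ Im f

assert lhs_a == rhs_a
assert lhs_b == rhs_b
```

Since this f is neither injective nor surjective, both sides differ from A and from B respectively, which shows why the corollary needs its hypotheses.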

Theorem 9.3.1 (Schur’s Lemma) Let M, N be R-modules. Then 

(a) M is a simple R-module implies any non-zero R-homomorphism f : M → N 
is an R-monomorphism; 

(b) N is a simple R-module implies any non-zero R-homomorphism f : M → N is 
an R-epimorphism; 

(c) M is a simple R-module implies End(M) is a division ring. 

Proof (a) ker f is a submodule of the R-module M. Hence M is simple ⇒ ker f = 
{0_M} or ker f = M. Since f is non-zero, ker f ≠ M. Hence ker f = {0_M} ⇒ f is 
an R-monomorphism. 

(b) Im f is a submodule of N. Hence N is simple ⇒ Im f = N or Im f = {0_N}. 
Since f is non-zero, Im f ≠ {0_N}. Hence Im f = N ⇒ f is an R-epimorphism. 

(c) In particular, let f : M → M be a non-zero R-homomorphism. Hence M 
is simple ⇒ f is an automorphism by (a) and (b) ⇒ f is invertible in the ring 
End(M) ⇒ End(M) is a division ring. □ 

Definition 9.3.4 Let R be a commutative ring and M be an R-module. Then for 
each a ∈ R, ∃ an R-endomorphism f_a : M → M, defined by f_a(x) = ax, called the 
homothecy defined by a. 

Proposition 9.3.6 The set H = {f_a : a ∈ R and f_a is the homothecy defined by a} 
is a subring of the ring End(M). 

Proof Consider the mapping ψ : R → End(M), defined by ψ(a) = f_a, where f_a : 
M → M is defined by f_a(x) = ax. 

Then ψ(a + b) = ψ(a) + ψ(b) and ψ(ab) = ψ(a)ψ(b), ∀a, b ∈ R ⇒ ψ is a ring 
homomorphism ⇒ H = Im ψ is a subring of End(M). □ 

Given an R-module M and an R-endomorphism f : M → M, can we express M 
as the direct sum of Im f and ker f? The solution of the following problem gives 
an answer. 

Problem Let f : M → M be an R-endomorphism such that f ∘ f = f. Is M = 
Im f ⊕ ker f true? 

Solution ker f and Im f are both submodules of M ⇒ Im f + ker f ⊆ M. We now 
show that Im f ⊕ ker f = M. For m ∈ M, m = f(m) + (m - f(m)). Again f(m - 
f(m)) = f(m) - (f ∘ f)(m) = f(m) - f(m) = 0_M ⇒ m - f(m) ∈ ker f ⇒ m ∈ 
Im f + ker f ⇒ M ⊆ Im f + ker f. Hence M = Im f + ker f. Again x ∈ Im f ∩ 
ker f ⇒ f(x) = 0_M and ∃ y ∈ M such that x = f(y). Now f(f(y)) = f(x) = 
0_M ⇒ (f ∘ f)(y) = 0_M ⇒ f(y) = 0_M ⇒ x = 0_M ⇒ Im f ∩ ker f = {0_M}. 
Hence M = Im f ⊕ ker f. 
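
The decomposition m = f(m) + (m - f(m)) used in the solution can be watched at work on a concrete idempotent endomorphism. The following sketch is illustrative only and assumes the Z-module Z_6 with f(x) = 3x, which satisfies f ∘ f = f because 9 ≡ 3 (mod 6); this example is not taken from the text.

```python
from itertools import product

n = 6
M = set(range(n))

def f(x):
    # Multiplication by 3 on Z_6; idempotent since 3 * 3 = 9 = 3 (mod 6).
    return (3 * x) % n

assert all(f(f(x)) == f(x) for x in M)

im = {f(x) for x in M}                 # {0, 3}
ker = {x for x in M if f(x) == 0}      # {0, 2, 4}

# Record every way of writing an element of M as u + k with u in Im f, k in ker f.
decompositions = {}
for u, k in product(im, ker):
    decompositions.setdefault((u + k) % n, []).append((u, k))

# Im f + ker f = M and each decomposition is unique, i.e. the sum is direct.
assert set(decompositions) == M
assert all(len(ways) == 1 for ways in decompositions.values())

# The unique decomposition is the one from the solution: m = f(m) + (m - f(m)).
assert all(decompositions[m] == [(f(m), (m - f(m)) % n)] for m in M)
```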


9.4 Quotient Modules and Isomorphism Theorems 

In this section we continue discussion of submodules and homomorphisms. The 
construction of quotient modules is analogous to that of quotient groups. 

We now prescribe a method of construction for new modules from the old ones. 
The most significant role played by submodules of a module is to yield other mod- 
ules known as quotient modules which are associated with the parent module in a 
natural way. Moreover, we prove isomorphism theorems for modules. 

Let M be an R-module and N be a submodule of M. Then (N, +) is a normal 
subgroup of the abelian group (M, +) and hence the quotient group (M/N, +) is 
also an abelian group under the composition +, defined by 

(x + N) + (y + N) = (x + y) + N ∀x, y ∈ M. 

Definition 9.4.1 Given an R-module M and a submodule N of M, the abelian 
group (M/N, +) inherits an R-module structure from M: 

R × M/N → M/N, defined by 

(r, x + N) ↦ rx + N, i.e., r · (x + N) = rx + N. 

The R-module M/N is called the quotient module of M by N. The natural map π : 
M → M/N, defined by π(x) = x + N ∀x ∈ M is an R-epimorphism with ker π = 
N and is called the canonical R-epimorphism or natural R-homomorphism. 

Theorem 9.4.1 (Correspondence Theorem) Let M be an R-module and N be a 
submodule of M. Then there exists an inclusion preserving bijection from the set 
s(M) of all submodules of M containing N to the set s(M/N) of all submodules 
of M/N. 

Proof Let A ∈ s(M). Then A is a submodule of M such that N ⊆ A ⊆ M. Again 
A/N = {a + N : a ∈ A} is a submodule of M/N. We now define a mapping f : 
s(M) → s(M/N) by 

f(A) = A/N ∀A ∈ s(M). 

f is injective: Let A, B ∈ s(M) be such that f(A) = f(B). Then A/N = 
B/N ⇒ given a ∈ A, ∃ b ∈ B such that a + N = b + N. Hence a - b ∈ N ⇒ 
a - b = n for some n ∈ N. Thus a = b + n ∈ B, since B is a submodule of M 
such that N ⊆ B. Hence A ⊆ B. Similarly B ⊆ A. Thus A = B ⇒ f is injective. 

f is surjective: Let P ∈ s(M/N). Then P is a submodule of M/N. Consider 
P* = {x ∈ M : x + N ∈ P}. Since P ≠ ∅, P* ≠ ∅. Again ∀x ∈ N, x + N = 0 + 
N ∈ P ⇒ ∀x ∈ N, x ∈ P* ⇒ N ⊆ P*. Clearly, P* is a submodule of M such that 
N ⊆ P* ⇒ P* ∈ s(M). We claim that f(P*) = P. Now x + N ∈ P*/N ⇔ x ∈ 
P* ⇔ x + N ∈ P ⇒ P*/N = P ⇒ f(P*) = P ⇒ f is surjective. 

Consequently, f is a bijection. Finally, let A, B ∈ s(M) be such that A ⊆ B. Now 
a + N ∈ A/N ⇒ a ∈ A ⊆ B ⇒ a + N ∈ B/N ⇒ A/N ⊆ B/N ⇒ f(A) ⊆ f(B). 
Thus A ⊆ B ⇒ f(A) ⊆ f(B). □ 

Fig. 9.1 Commutativity of the triangular diagram (maps f : M → N, 
π : M → M/K and f̃ : M/K → N) 

Fig. 9.2 Commutativity of the rectangular diagram (maps π_N : M → M/N, 
π_P : M → M/P, π_{N/P} : M/P → (M/P)/(N/P) and π̃ : M/N → (M/P)/(N/P)) 

Corollary Any submodule of M/N is of the form A/N for some submodule A of 
M such that N ⊆ A. 

Theorem 9.4.2 (Fundamental Homomorphism Theorem or First Isomorphism The- 
orem) Let M and N be R-modules and f : M → N be an R-homomorphism 
with K = ker f. Then there exists a unique R-isomorphism f̃ : M/K → Im f 
such that f = f̃ ∘ π, i.e., such that the diagram in Fig. 9.1 is commutative, where 
π : M → M/K, x ↦ x + K is the natural R-epimorphism. 

Proof Similar to that of groups. □ 

Corollary (Epimorphism Theorem) Let f : M → N be an R-epimorphism of R- 
modules with ker f = K. Then there exists an R-isomorphism f̃ : M/K → N of 
R-modules. 

Proof f is an R-epimorphism ⇒ f(M) = N ⇒ Im f = N ⇒ f̃ is an R-iso- 
morphism by Theorem 9.4.2. □ 

Theorem 9.4.3 (Second Isomorphism Theorem or Isomorphism Theorem of Quo- 
tient of Quotient) Let M be an R-module and P, N be submodules of M such 
that P ⊆ N. Then N/P is a submodule of M/P and there exists a unique natu- 
ral R-isomorphism π̃ : M/N → (M/P)/(N/P) such that the diagram in Fig. 9.2 is 
commutative, i.e., π_{N/P} ∘ π_P = π̃ ∘ π_N, where 

π_P : M → M/P, x ↦ x + P ∀x ∈ M, 

π_N : M → M/N, x ↦ x + N ∀x ∈ M, and 

π_{N/P} : M/P → (M/P)/(N/P), y ↦ y + N/P, ∀y ∈ M/P. 

Proof N is a submodule of M ⇒ N ≠ ∅ ⇒ N/P ≠ ∅. Now x + P, y + P ∈ N/P ⇒ 
x, y ∈ N ⇒ x + y ∈ N and rx ∈ N ∀r ∈ R ⇒ (x + y) + P ∈ N/P and rx + P ∈ 
N/P ∀r ∈ R ⇒ N/P is a submodule of M/P. Define the natural mapping π̃ : 
M/N → (M/P)/(N/P) by the rule 

π̃(x + N) = (x + P) + N/P. 

Now 

x + N = y + N (x, y ∈ M) ⇔ x - y ∈ N 

⇔ (x - y) + P ∈ N/P ⇔ (x + P) - (y + P) ∈ N/P 

⇔ (x + P) + N/P = (y + P) + N/P ⇔ π̃(x + N) = π̃(y + N). 

Thus π̃ is well defined and injective. 

Again for (x + P) + N/P ∈ (M/P)/(N/P), ∃ an element x + N ∈ M/N such that 
π̃(x + N) = (x + P) + N/P. Thus π̃ is surjective. Hence π̃ is a bijection. Finally, 
it is easy to show that π̃ is an R-homomorphism. Consequently, π̃ is a natural R- 
isomorphism. 

Uniqueness of π̃: Let ψ : M/N → (M/P)/(N/P) be an R-isomorphism (taking 
ψ in place of π̃) making the diagram in Fig. 9.2 commutative, i.e., 
ψ ∘ π_N = π_{N/P} ∘ π_P. Then 

ψ(x + N) = ψ(π_N(x)) = (ψ ∘ π_N)(x) = (π_{N/P} ∘ π_P)(x) 

= π_{N/P}(π_P(x)) = π_{N/P}(x + P) = (x + P) + N/P 

= π̃(x + N), ∀x + N ∈ M/N 

⇒ ψ = π̃ ⇒ uniqueness of π̃. □ 

Theorem 9.4.4 (Third Isomorphism Theorem or Theorem of Quotient of Quotient 
of a sum) Let M be an R-module. If A and B are submodules of M, then 

(i) the quotient modules A/(A ∩ B) and (A + B)/B are R-isomorphic and 

(ii) the quotient modules B/(A ∩ B) and (A + B)/A are R-isomorphic. 

Proof (i) Define a mapping f : A/(A ∩ B) → (A + B)/B by f(a + A ∩ B) = a + B. 

f is well defined: For a, x ∈ A, a + A ∩ B = x + A ∩ B ⇒ a - x ∈ A ∩ B ⊆ 
B ⇒ a - x ∈ B ⇒ a + B = x + B ⇒ f is independent of the representative a. 

f is injective: f(a + A ∩ B) = f(x + A ∩ B) ⇒ a + B = x + B ⇒ a - x ∈ B. 
Again a, x ∈ A ⇒ a - x ∈ A. Hence a - x ∈ A ∩ B ⇒ a + A ∩ B = x + A ∩ B. 

f is surjective: Every element of (A + B)/B has the form (a + b) + B = a + B 
with a ∈ A and b ∈ B, and a + B = f(a + A ∩ B). Hence f is surjective. 

Clearly, f is an R-homomorphism. Consequently, f is an R-isomorphism. 

(ii) Define a mapping g : B/(A ∩ B) → (A + B)/A by g(b + A ∩ B) = b + A. 
Then as in (i), g is well defined and an R-isomorphism. □ 
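
The map a + A ∩ B ↦ a + B from part (i) can be built explicitly for a small finite module and checked to be a well-defined bijection. This is a sketch under assumptions chosen for illustration only: the Z-module Z_24 with A = 4Z_24 and B = 6Z_24; the helper names cyclic and coset are hypothetical.

```python
n = 24

def cyclic(g):
    """Submodule of Z_24 generated by g."""
    return {(g * k) % n for k in range(n)}

def coset(x, H):
    """The coset x + H, as a hashable set."""
    return frozenset((x + h) % n for h in H)

A, B = cyclic(4), cyclic(6)
A_cap_B = A & B                                   # {0, 12}
A_plus_B = {(a + b) % n for a in A for b in B}    # the even residues

# Build phi(a + A∩B) = a + B and watch for conflicting values.
phi = {}
well_defined = True
for a in A:
    src, dst = coset(a, A_cap_B), coset(a, B)
    if phi.setdefault(src, dst) != dst:
        well_defined = False

codomain = {coset(x, B) for x in A_plus_B}        # (A + B)/B

assert well_defined
assert len(set(phi.values())) == len(phi)         # injective
assert set(phi.values()) == codomain              # surjective
```

Both quotients have three elements here, as the isomorphism A/(A ∩ B) ≅ (A + B)/B requires.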


9.5 Modules of Homomorphisms 

In this section R denotes a commutative ring with 1 and Hom_R(M, N) denotes 
the set of all R-homomorphisms from the R-module M to the R-module N. We 
now continue discussion of module homomorphisms and study the structure of 
Hom_R(M, N). 

Proposition 9.5.1 Let R be a commutative ring. Then for R-modules M and N, 
Hom_R(M, N) is an R-module. 

Proof For f, g ∈ Hom_R(M, N), define f + g by the rule (f + g)(x) = f(x) + g(x), 
∀x ∈ M. Then f + g : M → N is an R-homomorphism such that (Hom_R(M, N), +) 
is an abelian group. Define an external law of composition μ : R × Hom_R(M, N) → 
Hom_R(M, N) by μ(r, f) = rf, where rf : M → N is defined by (rf)(x) = rf(x) 
∀x ∈ M. The composition μ is well defined, because 

(rf)(x + y) = r(f(x + y)) = r(f(x) + f(y)) = rf(x) + rf(y) 

= (rf)(x) + (rf)(y) ∀x, y ∈ M and 

(rf)(sx) = r(f(sx)) = r(sf(x)) = (rs)f(x) = (sr)f(x), 

since R is commutative by hypothesis 

= s(rf(x)) = s(rf)(x) ∀x ∈ M and ∀r, s ∈ R. 

Moreover, r(f + g) = rf + rg; (r + s)f = rf + sf and r(sf) = (rs)f ∀r, s ∈ R 
and ∀f, g ∈ Hom_R(M, N). Consequently, Hom_R(M, N) is an R-module. □ 

Remark In the absence of commutativity of R, Hom_R(M, N) may fail to be an 
R-module. Thus the abelian group Hom_R(M, N) is not in general an R-module. 

We now examine the particular case when N = R. We can endow Hom_R(M, R) 
with the structure of a right R-module as follows: (f + g)(x) = f(x) + g(x) ∀f, g ∈ 
Hom_R(M, R) and ∀x ∈ M; μ : Hom_R(M, R) × R → Hom_R(M, R) is defined by 
μ(f, r)(x) = (fr)(x) = f(x)r ∀f ∈ Hom_R(M, R) and ∀r ∈ R. Then (fr)(x + 
y) = f(x + y)r = (f(x) + f(y))r = f(x)r + f(y)r = (fr)(x) + (fr)(y) ∀x, y ∈ 
M and (fr)(sx) = f(sx)r = sf(x)r = s(fr)(x) ∀x ∈ M and ∀r, s ∈ R. Hence 
the group homomorphism fr is indeed an R-homomorphism, and so belongs to 
Hom_R(M, R). 

Definition 9.5.1 If M is a left R-module, then the dual module of M is the right 
R-module M^d = Hom_R(M, R). The elements of M^d are called linear functionals 
(or linear forms) on M. Again we can obtain the dual of the right R-module M^d, 
which is the left R-module (M^d)^d. We call it the bidual of M and denote it by M^{dd}. 

Proposition 9.5.2 Given a commutative ring R and a fixed R-module M, the R- 
homomorphism f : N → P induces 

(a) an R-homomorphism f_* : Hom_R(M, N) → Hom_R(M, P) defined by f_*(α) = 
f ∘ α ∀α ∈ Hom_R(M, N); 

(b) an R-homomorphism f^* : Hom_R(P, M) → Hom_R(N, M) defined by f^*(β) = 
β ∘ f ∀β ∈ Hom_R(P, M). 

Proof (a) f_*(α + γ) = f ∘ (α + γ) by definition. 

Now 

(f ∘ (α + γ))(x) = f((α + γ)(x)) = f(α(x) + γ(x)) 

= f(α(x)) + f(γ(x)) = (f ∘ α)(x) + (f ∘ γ)(x) 

= (f ∘ α + f ∘ γ)(x) ∀x ∈ M 

⇒ f ∘ (α + γ) = f ∘ α + f ∘ γ 

⇒ f_*(α + γ) = f_*(α) + f_*(γ) ∀α, γ ∈ Hom_R(M, N). 

Similarly, f_*(rα) = rf_*(α) ∀r ∈ R, ∀α ∈ Hom_R(M, N). Hence f_* is an R- 
homomorphism. 

(b) Proceed as in (a). □ 

Corollary For the identity R-automorphism 1_N : N → N, 

(a) (1_N)_* : Hom_R(M, N) → Hom_R(M, N) is the identity R-automorphism. 

(b) (1_N)^* : Hom_R(N, M) → Hom_R(N, M) is the identity R-automorphism. 

Proposition 9.5.3 Let R be a commutative ring and M, N, P be R-modules. If 
f : M → N and g : N → P are R-homomorphisms, then for any R-module A, 

(i) (g ∘ f)_* : Hom_R(A, M) → Hom_R(A, P) is an R-homomorphism such that 
(g ∘ f)_* = g_* ∘ f_*; 

(ii) (g ∘ f)^* : Hom_R(P, A) → Hom_R(M, A) is an R-homomorphism such that 
(g ∘ f)^* = f^* ∘ g^*. 

Proof (i) g ∘ f : M → P is an R-homomorphism ⇒ (g ∘ f)_* : Hom_R(A, M) → 
Hom_R(A, P) defined by (g ∘ f)_*(α) = (g ∘ f) ∘ α is an R-homomorphism by 
Proposition 9.5.2. Clearly for every a ∈ A, ((g ∘ f)_*(α))(a) = ((g_* ∘ f_*)(α))(a) ⇒ 
(g ∘ f)_* = g_* ∘ f_*. 

(ii) Proceed as in (i). □ 

Corollary If f : M → N is an R-isomorphism, then for any R-module A, the in- 
duced R-homomorphisms 

f_* : Hom_R(A, M) → Hom_R(A, N) and f^* : Hom_R(N, A) → Hom_R(M, A) 

are R-isomorphisms. 

Proof f is an R-isomorphism ⇒ ∃ an R-homomorphism g : N → M such that 
g ∘ f = 1_M and f ∘ g = 1_N. Hence (g ∘ f)_* = (1_M)_* ⇒ g_* ∘ f_* = (1_M)_* by 
the Corollary to Proposition 9.5.2 and Proposition 9.5.3. Similarly, 
f_* ∘ g_* = (1_N)_*, f^* ∘ g^* = (1_M)^* and g^* ∘ f^* = (1_N)^*. Hence f_* and 
f^* are both R-isomorphisms. □ 


9.6 Free Modules, Modules over PID and Structure Theorems 

In this section we consider a special class of modules which is the most natural 
generalization of vector space and is also very important in the study of module 
theory. 

Every vector space has a basis, which may be finite or not. But this fails for mod- 
ules over arbitrary rings. This leads to the concept of free modules, which may be 
considered as the most natural generalization of the concept of vector spaces. Free 
modules play an important role in the theory of modules and are widely used in al- 
gebraic topology. To extend many results of vector spaces, we also study in this 
section finitely generated modules over principal ideal domains and finally prove 
the structure theorems. 

From now on we use the same symbol 0 to represent the zero element of R, the 
zero element of M as well as the zero module (unless there is confusion). 


9.6.1 Free Modules 

We now proceed to study free modules which are very important in the theory of 
modules and algebraic topology. 

Definition 9.6.1 Let R be a ring with 1 and M be an R-module. A subset S of M 
is said to be a basis of M iff S generates M and S is linearly independent over R. In 
particular, M is called cyclic iff M = Rx for some x ∈ M. 

Remark For a cyclic R-module M, there exists an R-epimorphism f : R → M such 
that R/ker f ≅ M. Indeed, M = Rx for some x ∈ M and the R-homomorphism f : 
R → M, r ↦ rx is an R-epimorphism. In particular, if R is a PID, then ker f = (a) 
for some a ∈ R and the cyclic R-module M is of the form R/(a), where (a) = 
Ann(M). 

Proposition 9.6.1 Let R be a ring with 1. A (left) R-module M is cyclic iff M is 
isomorphic to R/I for some left ideal I of R. 

Proof M is cyclic ⇒ ∃ some x ∈ M such that M = Rx. Then the map f : R → M, 
r ↦ rx, is an R-epimorphism with ker f = I (say), which is a left ideal of R. Hence 
M ≅ R/I. Conversely, let M ≅ R/I for some left ideal I of R. Then M is cyclic, 
because R/I = ⟨1 + I⟩. □ 

Proposition 9.6.2 An R-module M is the direct sum of submodules M_1, M_2, ..., M_n, 
denoted M = M_1 ⊕ M_2 ⊕ ··· ⊕ M_n iff 

(a) M = M_1 + M_2 + ··· + M_n; 

(b) M_{i+1} ∩ (M_1 + M_2 + ··· + M_i) = 0, ∀i, 1 ≤ i ≤ n - 1. 

Proof Left as an exercise. □ 

Definition 9.6.2 A cyclic R-module M = Rx is called free with a basis {x} iff every 
element y ∈ M can be expressed uniquely as y = rx for some r ∈ R. 

Remark A cyclic R-module M = Rx is free with a basis {x} iff Ann(x) = 0. 

Definition 9.6.3 An R-module M is said to be free on a finite basis iff 

(a) M = M_1 ⊕ M_2 ⊕ ··· ⊕ M_n; and 

(b) each M_i is a free cyclic R-module. 

If M_i = Rx_i, then B = {x_1, x_2, ..., x_n} is called a basis of the free R-module M. 

Remark Every element y of a free R-module M with a finite basis B = {x_1, x_2, ..., 
x_n} can be expressed uniquely as 

y = Σ_{i=1}^{n} r_i x_i, r_i ∈ R. 

This definition can be extended. 

Definition 9.6.4 An R-module M is said to be free iff M has a basis. 

More precisely, M is said to be a free R-module with a basis B iff every element 
x ∈ M can be expressed uniquely as x = Σ_{b∈B} r_b b, where r_b ∈ R and r_b = 0 
except for finitely many b's. 

Proposition 9.6.3 If M and N are free R-modules, then M ⊕ N is also a free R- 
module. In general, if {M_i}_{i∈I} is any family of free R-modules M_i with basis B_i, 
then ⊕_{i∈I} M_i is a free R-module with a basis B = ∪_{i∈I} B_i. 

Proof Let A be a basis of M and B be a basis of N. Then M ⊕ N has a basis 
(A × {0}) ∪ ({0} × B) and hence M ⊕ N is a free R-module. 

The proof of the last part is left as an exercise. □ 

Example 9.6.1 (i) The zero module is a free module with empty basis. 

(ii) If R is a unitary ring, then as a module over itself R admits a basis B = {1}. 

(iii) If R is a unitary ring, then for any positive integer n, R^n is a free R-module 
with a basis B = {e_i}, where e_i = (0, 0, ..., 1, 0, ..., 0), 1 is at the i-th place, i = 
1, 2, ..., n. 

(iv) Every vector space V is a free module (over a field), since V has a basis. 

(v) M = Z_n is not free over Z, because Ann(x) ≠ {0} for every x (≠ 0) ∈ Z_n. 

(vi) Any finite abelian group G (≠ 0) is not a free Z-module. 

Remark The basis of a free module is not unique. For example, for a unitary ring 
R, R^3 is a free R-module with different bases A = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and 
B = {(-1, 0, 0), (0, -1, 0), (0, 0, -1)}. 


Theorem 9.6.1 Let R be a commutative ring with 1. Any two finite bases of a free 
R-module M have the same number of elements. 

Proof Let M be a free module with a basis {x_1, x_2, ..., x_n} and I be a maximal 
ideal of R. Then R/I = F is a field. Since V = M/IM is annihilated by I, V is a 
vector space over F. If [x_i] = x_i + IM (1 ≤ i ≤ n), then S = {[x_1], [x_2], ..., [x_n]} 
is a basis of V over F. Since any two finite bases of a vector space have the same 
number of elements, the theorem follows. □ 

Definition 9.6.5 Let R be a commutative ring with 1. If a free R-module F has a 
basis with n elements, then n is called the rank of F, denoted rank F = n. In general, 
if F is a free module with a basis B, then rank F = card B. 

Example 9.6.2 The R-module R^n in Example 9.6.1(iii) is a free module of rank n. 

Remark Theorem 9.6.1 fails for modules over arbitrary rings, i.e., for an arbitrary 
ring R, a free R-module may have bases of different cardinalities. 

For example, let V be a vector space of countably infinite dimension over a di- 
vision ring F and R = End_F(V) = {f : V → V : f is a linear transformation}. 
Then R is a free module over itself with a basis {1_V}. Moreover, ∃ a basis B_n = 
{f_1, f_2, ..., f_n} for R having |B_n| = n, for any given positive integer n, where the 
f_i's are defined as follows: 

Let B = {b_1, b_2, ..., b_n, ...} be a countably infinite basis of V over F. Define 
f_1, f_2, ..., f_n by assigning their values on the b_i's as presented in Table 9.1. 

Clearly, B_n is a basis of R over itself with card B_n = n ≠ 1. 

Table 9.1 Table defining the functions f_1, f_2, ..., f_n 

               f_1        f_2        f_3        f_4        ...   f_n 
b_1            b_1        0          0          0                0 
b_2            0          b_1        0          0                0 
...
b_n            0          0          0          0                b_1 
b_{n+1}        b_2        0          0          0                0 
b_{n+2}        0          b_2        0          0                0 
...
b_{2n}         0          0          0          0                b_2 
...
b_{mn+1}       b_{m+1}    0          0          0                0 
b_{mn+2}       0          b_{m+1}    0          0                0 
...
b_{(m+1)n}     0          0          0          0                b_{m+1} 

That is, f_i(b_{mn+j}) = b_{m+1} if i = j and 0 otherwise, for 1 ≤ i, j ≤ n and 
every integer m ≥ 0. 

Proposition 9.6.4 Let R be an integral domain and M be a free R-module of finite 
rank n. Then any (n + 1) elements of M are linearly dependent over R. 

Proof By hypothesis, M ≅ R ⊕ R ⊕ ··· ⊕ R (n summands). Let Q(R) = F be the 
quotient field of R. Then M is embedded in F ⊕ F ⊕ ··· ⊕ F (n summands), which 
is an n-dimensional vector space over F. Hence any (n + 1) elements of M are 
linearly dependent over F, and multiplying a non-trivial dependence relation by a 
common denominator of its coefficients gives a non-trivial relation over R. □ 

Remark Vector spaces over a field F are unitary F-modules. Free modules and 
vector spaces are closely related. They have many common properties but they differ 
in some properties. Some of them are given now. 

Unlike the case of vector spaces, it is not true that a linearly independent subset 
of a free module can always be extended to a basis. 

For example, Z is a Z-module having {1} (also {-1}) as a basis. {3} is linearly 
independent over Z but Z ≠ ⟨{3}⟩, and {3} cannot be enlarged to a basis because 
any two integers are linearly dependent over Z. 

Remark Unlike the case of vector spaces, it is not true that every subset S of a 
free module M which spans M contains a basis for M. 

For example, consider the Z-module Z. Let S = {p, q}, where p, q are inte- 
gers such that gcd(p, q) = 1 and they are not units. Then 1 = ap + bq, for some 
a, b ∈ Z ⇒ x = (xa)p + (xb)q ∀x ∈ Z ⇒ Z = ⟨S⟩. But S is linearly dependent 
over Z, since qp - pq = 0 ⇒ S is not a basis for Z. Neither {p} nor {q} generates 
Z, because p and q are not units. Hence S contains no basis for Z. 
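
The spanning example with S = {p, q} can be traced numerically. The sketch below is only an illustration: it takes the hypothetical values p = 2 and q = 3 and uses the standard extended Euclidean algorithm (a technique not described in the text) to produce the coefficients a, b with ap + bq = 1.

```python
def extended_gcd(p, q):
    """Return (g, a, b) with a*p + b*q == g, where g is the gcd of p and q."""
    old_r, r = p, q
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_a, a = a, old_a - k * a
        old_b, b = b, old_b - k * b
    return old_r, old_a, old_b

p, q = 2, 3                      # coprime non-units of Z
g, a, b = extended_gcd(p, q)
assert g == 1 and a * p + b * q == 1

# S = {p, q} spans Z: every x equals (x*a)*p + (x*b)*q.
assert all((x * a) * p + (x * b) * q == x for x in range(-10, 11))

# S is linearly dependent: q*p + (-p)*q = 0 with non-zero coefficients.
assert q * p + (-p) * q == 0

# Neither {p} nor {q} alone spans Z, because 1 is not a multiple of p or of q.
assert 1 % p != 0 and 1 % q != 0
```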
Remark A finitely generated free module M may have a submodule which is neither 
free nor finitely generated. 

For example, let F be a field and R = F[x_1, x_2, ..., x_n, ...], which is a commu- 
tative ring with 1 (in infinitely many variables). Then M = R is a free module over 
R with a basis {1}. The submodules of M are precisely the ideals of R. Consider the 
ideal N of all polynomials with constant term zero, i.e., N = (x_1, x_2, ..., x_n, ...). 
Then N is not finitely generated ⇒ N is not a principal ideal ⇒ N is not free, because 
the only non-zero ideals of R which are free as R-modules are principal ideals. 

Theorem 9.6.2 (a) {x_i}_{i∈I} is a basis of an R-module M iff M = ⊕_{i∈I} Rx_i, where 
Rx_i ≅ R for every i ∈ I. 

(b) An R-module M is free iff M is isomorphic to a direct sum of copies of R. 

Proof (a) Let M = ⊕_{i∈I} Rx_i, where Rx_i ≅ R. Then every element x ∈ M can be 
expressed uniquely as x = Σ_{i∈I} r_i x_i, where r_i x_i = 0_M except for a finite 
number of r_i x_i's. Hence {x_i} forms a basis of M ⇒ M is a free R-module. 

Conversely, let M be a free R-module with a basis {x_i}. Then the mapping f : 
R → Rx_i, defined by f(r) = rx_i is an R-isomorphism. 

Moreover, Rx_i ∩ Rx_j = {0_M} if i ≠ j. Again Rx_i ⊆ M ∀i ∈ I ⇒ ⊕_{i∈I} Rx_i ⊆ 
M. Now x ∈ M ⇒ x = r_1 x_{i_1} + ··· + r_n x_{i_n}, where x_{i_r} ∈ {x_i}_{i∈I} ⇒ 
M ⊆ ⊕_{i∈I} Rx_i. 

Hence M = ⊕_{i∈I} Rx_i, where Rx_i ≅ R for each i ∈ I. 

(b) It follows from (a). □ 

Remark In view of Theorem 9.6.2, Definition 9.6.4 can be restated as follows: 

Definition 9.6.6 A free R-module is one which is isomorphic to an R-module of 
the form ⊕_{i∈I} M_i, where each M_i ≅ R (as an R-module). 

A finitely generated free R-module is therefore isomorphic to R ⊕ ··· ⊕ R 
(n summands), which is denoted by R^n (R^0 is taken to be the zero module denoted 
by 0). 

Theorem 9.6.3 Let R be a ring with 1. Then M is a finitely generated R-module 
iff M is isomorphic to a quotient of R^n for some integer n > 0. 

Proof Let x_1, x_2, ..., x_n generate M. Define f : R^n → M by 

f(a_1, a_2, ..., a_n) = Σ_{i=1}^{n} a_i x_i. 

Then f is an R-module homomorphism onto M. Hence M ≅ R^n/ker f by Theo- 
rem 9.4.2. 

Conversely, let us have an R-module homomorphism f of R^n onto M. If e_j = (0, 
0, ..., 1, 0, ..., 0) (1 ∈ R being in the j-th place), then {e_j} (1 ≤ j ≤ n) gener- 
ates R^n. Hence {f(e_j)} (1 ≤ j ≤ n) generates M. □ 

9.6.2 Modules over PID 

Many results on modules aim to extend desirable results of vector spaces. 
For this purpose, we now consider finitely generated modules over principal ideal 
domains (PID). We now show that every submodule of a free module of finite rank 
is also a free module and it is possible to choose bases for the two modules which 
are closely related in a simple way. 

Theorem 9.6.4 Let R be a PID and M be a free module of finite rank r over R 
and N be a non-zero submodule of M. Then there exists a basis {x\, X 2 , . . . , x r } 
of M and an integer t (1 <t <r) and non-zero elements a \ , 02 , . . . , a t in R such 
that {a\x\, a 2 X 2 , . . . , a t x t } is a basis ofN and ai 1 < i < t — 1. 


Proof Let $f : M \to R$ be an $R$-homomorphism. Then $f(N)$ is an ideal of $R$ $\Rightarrow$ $\exists$
some element $a_f$ (say) in $R$ such that $f(N) = (a_f)$, since $R$ is a PID. Consider the
set $S = \{f(N) = (a_f) : f : M \to R \text{ is an } R\text{-homomorphism}\}$. If $f = 0$,
then $(a_f) = (0) \in S \Rightarrow S \ne \emptyset$. Again $R$ is Noetherian. Then the non-empty set $S$
of ideals of $R$, partially ordered by inclusion, has a maximal element, i.e., $\exists$ an
$R$-homomorphism $g : M \to R$ such that the ideal $g(N) = (a_g)$ is not properly
contained in any other element of $S$. For this $g$, let $a_1 = a_g$ and $g(x) = a_1$ for
some $x \in N$. Then $(a_1)$ is a maximal element in $S \Rightarrow a_1 \ne 0$, because $N \ne 0$ forces
some $R$-homomorphism $M \to R$ (for instance a coordinate map with respect to a basis
of $M$) to be non-zero on $N$.

Let $\{y_1, y_2, \ldots, y_r\}$ be a basis of $M$. If $x = \sum_{i=1}^{r} d_i y_i$, then
$\pi_i : M \to R$, $x \mapsto d_i$ is the natural $R$-epimorphism. It follows from the
maximality of $(a_1)$ that $a_1 \mid \pi_i(x)$ $\forall i$: if $d$ is a gcd of $a_1$ and $\pi_i(x)$, write
$d = ua_1 + v\pi_i(x)$; the homomorphism $ug + v\pi_i$ sends $x$ to $d$, so the image of $N$
under it contains $(d) \supseteq (a_1)$, and maximality forces $(d) = (a_1)$. Then
$\pi_i(x) = a_1 s_i$ for some $s_i \in R$. Define $x_1 = \sum_{i=1}^{r} s_i y_i$. Then
$a_1 x_1 = \sum_{i=1}^{r} a_1 s_i y_i = x \Rightarrow a_1 = g(x) = g(a_1 x_1) = a_1 g(x_1) \Rightarrow g(x_1) = 1$,
since $a_1 \ne 0$. Taking this element $x_1$ as an element in a basis for $M$ and the
element $a_1x_1$ as an element in a basis for $N$, we show that

(i) $M = Rx_1 \oplus \ker g$ and (ii) $N = Ra_1x_1 \oplus (N \cap \ker g)$.

To prove (i), let $y \in M$. Then $y = g(y)x_1 + (y - g(y)x_1)$.

Now $g(y - g(y)x_1) = g(y) - g(y)g(x_1) = g(y) - g(y) \cdot 1 = 0 \Rightarrow y - g(y)x_1 \in \ker g
\Rightarrow y \in Rx_1 + \ker g \Rightarrow M \subseteq Rx_1 + \ker g$.

Again, $rx_1 \in \ker g \Rightarrow 0 = g(rx_1) = rg(x_1) = r \cdot 1 = r \Rightarrow Rx_1 \cap \ker g = \{0\}$.
Hence $M = Rx_1 \oplus \ker g$.

(ii) is proved similarly.

We now prove the theorem by induction on $t > 0$, the rank of $N$.

(ii) $\Rightarrow t = \operatorname{rank} N = \operatorname{rank}(Ra_1x_1) + \operatorname{rank}(N \cap \ker g) \Rightarrow N \cap \ker g$ is an
$R$-module of rank $t - 1$ $\Rightarrow N \cap \ker g$ is free by the induction hypothesis $\Rightarrow N$ is a
free $R$-module of rank $t$ having the basis $\{a_1x_1\} \cup B_{t-1}$, where $B_{t-1}$ is any basis
of $N \cap \ker g$ (by using (ii)).

To prove the last part we use induction on $r$, the rank of $M$. Now (i) $\Rightarrow$
$\operatorname{rank} \ker g = r - 1 \Rightarrow \exists$ a basis $\{x_2, x_3, \ldots, x_r\}$ of $\ker g$ such that
$\{a_2x_2, a_3x_3, \ldots, a_tx_t\}$ is a basis of $N \cap \ker g$ by the induction hypothesis, for
some elements $a_2, a_3, \ldots, a_t$ in $R$ such that $a_2 \mid a_3 \mid \cdots \mid a_t$. Consequently, (i)
and (ii) show that $\{x_1, x_2, \ldots, x_r\}$ is a basis of $M$ and $\{a_1x_1, a_2x_2, \ldots, a_tx_t\}$ is a
basis of $N$.

Finally, maximality of $(a_1)$ in $S \Rightarrow (a_2) \subseteq (a_1) \Rightarrow a_1 \mid a_2$. □
9.6.3 Structure Theorems

We are now in a position to determine the structure of a finitely generated
module over a PID and, in particular, the structure of a finitely generated
abelian group and of a finite abelian group. We show in the next theorem that every
finitely generated module over a PID is isomorphic to the direct sum of finitely
many cyclic modules. This theorem is the main objective of this section. It is often
called the fundamental theorem or structure theorem for finitely generated modules over
principal ideal domains.

Theorem 9.6.5 (Structure Theorem) Let $R$ be a PID and $M$ be a finitely generated
$R$-module. Then

$$M \cong \underbrace{R \oplus \cdots \oplus R}_{r \text{ copies}} \oplus R/(q_1) \oplus R/(q_2) \oplus \cdots \oplus R/(q_t),$$

for some integer $r \ge 0$ and non-zero non-unit elements $q_1, q_2, \ldots, q_t$ of $R$ such
that $q_1 \mid q_2 \mid \cdots \mid q_t$, where the $q_i$'s are unique up to units.

Proof Let $B = \{y_1, y_2, \ldots, y_m\}$ be a minimal generating set of $M$. Then $R^m$ is a
free $R$-module of rank $m$. Define

$$f : R^m \to M, \quad f(a_1, a_2, \ldots, a_m) = \sum_{i=1}^{m} a_i y_i.$$

Then $f$ is an $R$-epimorphism $\Rightarrow M \cong R^m/\ker f$ (see Theorem 9.4.2).

Again by Theorem 9.6.4 applied to $R^m$ and its submodule $\ker f$, $\exists$ a basis
$\{x_1, x_2, \ldots, x_m\}$ of $R^m$ such that $\{q_1x_1, q_2x_2, \ldots, q_tx_t\}$, $t \le m$, is a basis of $\ker f$
for some non-zero non-units $q_i$ of $R$ satisfying the divisibility relation $q_1 \mid q_2 \mid \cdots \mid q_t$.

Hence $M \cong R^m/\ker f \Rightarrow M \cong (Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_m)/(Rq_1x_1 \oplus Rq_2x_2 \oplus \cdots \oplus Rq_tx_t)$.

Consider the natural $R$-epimorphism

$$h : Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_m \to R/(q_1) \oplus R/(q_2) \oplus \cdots \oplus R/(q_t) \oplus R^{m-t},$$

$$r_1x_1 + r_2x_2 + \cdots + r_mx_m \mapsto (r_1 + (q_1), r_2 + (q_2), \ldots, r_t + (q_t), r_{t+1}, \ldots, r_m).$$

An element lies in $\ker h$ iff $q_i$ divides $r_i$ for $i = 1, 2, \ldots, t$ and
$r_{t+1} = \cdots = r_m = 0$. Then

$$\ker h = Rq_1x_1 \oplus Rq_2x_2 \oplus \cdots \oplus Rq_tx_t$$

shows that

$$M \cong R^{m-t} \oplus R/(q_1) \oplus \cdots \oplus R/(q_t).$$

If $q$ is a unit, then $R/(q) = 0$ and hence any such term will not appear in the above
expression.
Taking $m - t = r$, we have $M \cong R^r \oplus R/(q_1) \oplus \cdots \oplus R/(q_t)$. □
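For $R = \mathbb{Z}$ the elements $q_1 \mid q_2 \mid \cdots$ can be computed in small cases. The sketch below is an illustration only and does not follow the proof above: it relies on the standard fact, not proved in this text, that for the subgroup of $\mathbb{Z}^n$ spanned by the rows of an integer matrix the $k$th divisor equals $d_k/d_{k-1}$, where $d_k$ is the gcd of all $k \times k$ minors. The function names are hypothetical.

```python
from itertools import combinations
from math import gcd


def det(matrix):
    """Integer determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(matrix)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total


def determinantal_divisor(rows, k):
    """gcd of all k x k minors of the integer matrix `rows`."""
    m, n = len(rows), len(rows[0])
    g = 0
    for rs in combinations(range(m), k):
        for cs in combinations(range(n), k):
            sub = [[rows[i][j] for j in cs] for i in rs]
            g = gcd(g, det(sub))
    return g


def invariant_factors(rows):
    """Return [a_1, ..., a_t] with a_1 | a_2 | ... | a_t for the row lattice of `rows`."""
    factors = []
    previous = 1
    for k in range(1, min(len(rows), len(rows[0])) + 1):
        d_k = determinantal_divisor(rows, k)
        if d_k == 0:  # the rank has been reached: no further non-zero a_k
            break
        factors.append(d_k // previous)
        previous = d_k
    return factors


# Relations 2x + 4y = 0 and 6x + 8y = 0 on two generators x, y.
relations = [[2, 4], [6, 8]]
print(invariant_factors(relations))
```

For the relation matrix shown, the divisors are 2 and 4, so the abelian group with these two relations is isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_4$. A divisor equal to 1 is a unit and contributes the zero module, as noted at the end of the proof.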

Remark We show a little later the invariance of $r$. If $r = 0$, $M$ is said to be a torsion
$R$-module.

Theorem 9.6.6 Let $R$ be a PID and $M$ be a finitely generated torsion $R$-module. Then
$M \cong R/(q_1) \oplus R/(q_2) \oplus \cdots \oplus R/(q_t)$ with $q_1 \mid q_2 \mid \cdots \mid q_t$, where the ideals
$(q_1), (q_2), \ldots, (q_t)$ are uniquely determined, i.e., $q_1, q_2, \ldots, q_t$ are unique up to
units.

Proof Taking $r = 0$ in Theorem 9.6.5, let

$$M \cong R/(q_1) \oplus R/(q_2) \oplus \cdots \oplus R/(q_t) \cong R/(a_1) \oplus R/(a_2) \oplus \cdots \oplus R/(a_k),$$

where $q_1 \mid q_2 \mid \cdots \mid q_t$ and $a_1 \mid a_2 \mid \cdots \mid a_k$.

If $x \in M$, then $x = x_1 + x_2 + \cdots + x_t$, for some $x_i \in R/(q_i)$. For a prime $p$
in $R$, let $M(p) = \{x \in M : px = 0\}$. Then $M(p)$ is a vector space over the field
$F = R/(p)$.

Now $x \in M(p) \Leftrightarrow px = 0 \Leftrightarrow px_i = 0$ $\forall i$ $\Rightarrow M(p)$ is the direct sum of the
kernels of the multiplication by $p$ in each component. Hence $\dim M(p)$ over $F$ is the
number of terms $R/(q_i)$ for which $p \mid q_i$.

If $p \mid q_1$, then $p \mid q_i$ $\forall i$. Again for the second decomposition of $M$, $p$ must divide
at least $t$ of the $a_i$'s. Hence $t \le k$. Similarly, $k \le t$. Hence $t = k$. Let $q_i = pb_i$, $b_i \in R$
and $1 \le i \le t$. Then

$$pM \cong pR/(pb_1) \oplus pR/(pb_2) \oplus \cdots \oplus pR/(pb_t) \cong R/(b_1) \oplus R/(b_2) \oplus \cdots \oplus R/(b_t),$$

where $b_1 \mid b_2 \mid \cdots \mid b_t$.

By induction on the number of irreducible factors of $q_1$, it follows that $(b_1), \ldots, (b_t)$
are determined uniquely up to units. Hence $(q_1), (q_2), \ldots, (q_t)$ are uniquely
determined up to units. □

Corollary 1 If $M = F(M) \oplus T(M)$, then $T(M)$ is uniquely determined.

Remark The torsion part $T(M)$ of $M$ over a PID is unique but $F(M)$ is not unique,
because different bases $B$ and $B'$ for $M/T(M)$ give different free parts $F$ and $F'$
(say). Then

$$M = F(M) \oplus T(M) \text{ and } M = F'(M) \oplus T(M) \Rightarrow F(M) \cong M/T(M) \cong F'(M).$$

The Structure Theorem 9.6.5 can also be stated as follows: 

Theorem 9.6.7 (Structure Theorem) Let $R$ be a PID and $M$ be a finitely generated
module over $R$. Then $M$ can be expressed uniquely as

$$M \cong F \oplus R/(q_1) \oplus R/(q_2) \oplus \cdots \oplus R/(q_t),$$

where $F$, called the free part of the module $M$, is uniquely determined up to
isomorphism, and $q_1, q_2, \ldots, q_t$ are determined uniquely up to multiplication by units
such that $q_1 \mid q_2 \mid \cdots \mid q_t$.

Proof It follows from Theorem 9.6.6 and Corollary 1. □

In particular, for R = Z, the following structure theorem for finitely generated 
abelian groups is obtained. 

Corollary (Structure Theorem) Let $G$ be a finitely generated abelian group. Then
$G \cong F \oplus \mathbb{Z}_{q_1} \oplus \mathbb{Z}_{q_2} \oplus \cdots \oplus \mathbb{Z}_{q_t}$, $q_1 \mid q_2 \mid \cdots \mid q_t$, where $F$ is a free abelian group and
$\mathbb{Z}_{q_n}$ is the cyclic group of order $q_n$, $1 \le n \le t$. In particular, if $G$ is a finite abelian
group, then $G$ is isomorphic to the direct sum of its Sylow subgroups.

Definition 9.6.7 The rank of $F$ is called the free rank or Betti number of $M$ and the
elements $q_1, q_2, \ldots, q_t$ are called the invariant factors of $M$.

Remark Torsion modules $M$ are precisely the modules of rank zero and $\operatorname{Ann}(M) = (q_t)$.
A torsion free module may not be free as a module. For example, the abelian
group $\mathbb{Q}$ of rationals is torsion free over $\mathbb{Z}$ but not free over $\mathbb{Z}$ (see Ex. 12 of the
Worked-Out Exercises).

Definition 9.6.8 Let $R$ be a PID and $M$ be a finitely generated $R$-module such that
$\operatorname{Ann}(M) = (e)$. Then $e$ is called the exponent of $M$. It is unique up to units.

Existence of $e$: Let $M = \langle x_1, x_2, \ldots, x_t \rangle$. Choose $r_i\ (\ne 0) \in R$ such that $r_ix_i = 0$.
Then $r = \prod r_i \ne 0$. Hence $rM = 0 \Rightarrow \operatorname{Ann}(M) \ne 0 \Rightarrow \operatorname{Ann}(M) = (e)$, $e\ (\ne 0) \in R$.

Proposition 9.6.5 Let $R$ be a PID and $M$ be a finitely generated torsion module
over $R$ with exponent $e = ab$, where $a, b \in R$ and $\gcd(a, b) = 1$. Then
$M = M_1 \oplus M_2$, where $M_1 = \{x \in M : ax = 0\}$ and $M_2 = \{x \in M : bx = 0\}$.

Proof $\gcd(a, b) = 1 \Rightarrow \exists\, m, n \in R$ such that $ma + nb = 1 \Rightarrow x = max + nbx$,
$\forall x \in M$.

Again, $b(max) = ab(mx) = emx = 0 \Rightarrow max \in M_2$.

Similarly, $nbx \in M_1$. Consequently, $M = M_1 + M_2$.

Again $x \in M_1 \cap M_2 \Rightarrow ax = 0 = bx \Rightarrow max + nbx = 0 \Rightarrow x = 0 \Rightarrow M_1 \cap M_2 = \{0\}$.
Hence $M = M_1 \oplus M_2$. □
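The splitting $x = max + nbx$ used in this proof can be checked numerically. The sketch below takes $R = \mathbb{Z}$, $M = \mathbb{Z}_{12}$, $a = 4$, $b = 3$ as an assumed example; the helper names are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with m*a + n*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, m, n = extended_gcd(b, a % b)
    return g, n, m - (a // b) * n


def split(x, a, b, e):
    """Split x in Z_e (e = a*b, gcd(a, b) = 1) as (nbx, max): first part in M_1, second in M_2."""
    g, m, n = extended_gcd(a, b)
    assert g == 1 and a * b == e
    return (n * b * x) % e, (m * a * x) % e


e, a, b = 12, 4, 3
M1 = [x for x in range(e) if (a * x) % e == 0]  # elements killed by a
M2 = [x for x in range(e) if (b * x) % e == 0]  # elements killed by b
print(M1, M2)

for x in range(e):
    x1, x2 = split(x, a, b, e)
    # Each x is the sum of one element of M_1 and one element of M_2.
    assert x1 in M1 and x2 in M2 and (x1 + x2) % e == x
```

Here $M_1 = \{0, 3, 6, 9\}$ and $M_2 = \{0, 4, 8\}$ meet only in 0 and have $4 \cdot 3 = 12$ sums, which is the direct sum decomposition of the proposition in this instance.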

Using the Chinese remainder theorem we now proceed to decompose further the 
cyclic modules in Theorem 9.6.5. 

Theorem 9.6.8 Let $R$ be a PID and $M$ be a finitely generated $R$-module. Let $M$ be a
non-zero $R$-module with exponent $e$. If $e = p_1^{n_1} p_2^{n_2} \cdots p_t^{n_t}$ is the unique factorization
of $e$ in $R$, then

$$M \cong R^r \oplus R/(p_1^{n_1}) \oplus R/(p_2^{n_2}) \oplus \cdots \oplus R/(p_t^{n_t}),$$

where $r \ge 0$ is an integer and $p_1^{n_1}, p_2^{n_2}, \ldots, p_t^{n_t}$ are positive powers of distinct
primes in $R$.

Proof For $i \ne j$, $(p_i^{n_i}) + (p_j^{n_j}) = (1) = R$, since $\gcd(p_i, p_j) = 1$ $\Rightarrow$ the ideals
$(p_i^{n_i})$ are pairwise co-prime, $i = 1, 2, \ldots, t$ $\Rightarrow \bigcap_i (p_i^{n_i}) = (e)$, since $e$ is the lcm of
$p_1^{n_1}, p_2^{n_2}, \ldots, p_t^{n_t}$ $\Rightarrow R/(e) \cong R/(p_1^{n_1}) \oplus R/(p_2^{n_2}) \oplus \cdots \oplus R/(p_t^{n_t})$, by the Chinese
Remainder Theorem.

Hence the theorem follows from Theorem 9.6.5. □
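The Chinese Remainder step can be made concrete for $R = \mathbb{Z}$ and $e = 12 = 2^2 \cdot 3$. The following sketch checks by enumeration that reduction modulo 4 and modulo 3 is a bijection $\mathbb{Z}_{12} \to \mathbb{Z}_4 \oplus \mathbb{Z}_3$ respecting addition and multiplication; the name `crt_map` is a hypothetical helper, and the moduli are an assumed example.

```python
from itertools import product


def crt_map(x, moduli):
    """Send x + (e) to (x + (q_1), ..., x + (q_k)) for pairwise co-prime moduli q_i."""
    return tuple(x % q for q in moduli)


moduli = (4, 3)  # the prime powers 2^2 and 3
e = 12           # their product, which is also their lcm

# Bijectivity: the 12 residues hit every pair in Z_4 x Z_3.
images = {crt_map(x, moduli) for x in range(e)}
assert images == set(product(range(4), range(3)))

# Compatibility with addition and multiplication, componentwise.
for x in range(e):
    for y in range(e):
        fx, fy = crt_map(x, moduli), crt_map(y, moduli)
        added = tuple((u + v) % q for u, v, q in zip(fx, fy, moduli))
        multiplied = tuple((u * v) % q for u, v, q in zip(fx, fy, moduli))
        assert crt_map((x + y) % e, moduli) == added
        assert crt_map((x * y) % e, moduli) == multiplied

print(crt_map(7, moduli))
```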

Remark This isomorphism is an isomorphism both of rings and of $R$-modules.

Definition 9.6.9 Let $R$ be a PID and $M$ be a finitely generated $R$-module. The
prime powers $p_1^{n_1}, p_2^{n_2}, \ldots, p_t^{n_t}$ are called the elementary divisors of $M$.

Theorem 9.6.9 (Primary Decomposition Theorem) Let $R$ be a PID and $M$ be a
non-zero torsion $R$-module with exponent $e$.

If $e = p_1^{n_1} p_2^{n_2} \cdots p_t^{n_t}$ is the unique factorization in $R$ and
$M_i = \{x \in M : p_i^{n_i} x = 0\}$, $1 \le i \le t$, then $M = M_1 \oplus M_2 \oplus \cdots \oplus M_t$.

Proof Each $M_i$ is a submodule of $M$ such that the ideals $(p_i^{n_i})$ and $(p_j^{n_j})$ are
pairwise co-prime for $i \ne j$. Hence the theorem follows by applying the Chinese Remainder
Theorem for rings (see Ex. 34 of Exercises-I of Chap. 9). □

Definition 9.6.10 The submodule $M(p_i) = \{x \in M : p_i^{n_i} x = 0\}$ of $M$ is called the
$p_i$-primary component of $M$, where $p_i^{n_i}$ is an elementary divisor of $M$.

Remark The concepts of elementary divisors of a finitely generated module $M$ and
the invariant factors of the primary components $M(p_i)$ of $T(M)$ coincide.


Theorem 9.6.10 Let $R$ be a PID and $p$ be a prime element in $R$ and $F$ be the
field $R/(p)$.

(a) If $M = R^n$, then $M/pM \cong F^n$.

(b) If $M = R/(q)$ for a non-zero element $q$ in $R$, then

$$M/pM \cong \begin{cases} F, & \text{if } p \mid q \text{ in } R \\ 0, & \text{otherwise.} \end{cases}$$

(c) If $M \cong R/(q_1) \oplus R/(q_2) \oplus \cdots \oplus R/(q_t)$, where $p \mid q_i$ $\forall i$, then $M/pM \cong F^t$.

Proof (a) Consider the natural $R$-epimorphism

$$\pi : R^n \to (R/(p))^n, \quad (a_1, a_2, \ldots, a_n) \mapsto (a_1 + (p), a_2 + (p), \ldots, a_n + (p)).$$

Then $\ker \pi = pR^n$. Hence $R^n/pR^n \cong (R/(p))^n \Rightarrow M/pM \cong F^n$.

(b) Left as an exercise. 

(c) Left as an exercise. 


□ 
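Parts (b) and (c), left as exercises, can at least be sanity-checked for $R = \mathbb{Z}$ by counting elements. The sketch below lists the submodule $pM$ of $M = \mathbb{Z}_q$ and returns the order of $M/pM$; it is a numerical illustration under these assumptions, not a proof, and the function names are hypothetical.

```python
from math import prod


def quotient_order(q, p):
    """Order of M/pM for the cyclic Z-module M = Z_q, found by listing pM."""
    pM = {(p * x) % q for x in range(q)}
    return q // len(pM)


def quotient_order_of_sum(qs, p):
    """Order of M/pM for M = Z_{q_1} (+) ... (+) Z_{q_t}, computed summand by summand."""
    return prod(quotient_order(q, p) for q in qs)


print(quotient_order(12, 2))                # p divides q: M/pM has p elements
print(quotient_order(12, 5))                # p does not divide q: M/pM is zero
print(quotient_order_of_sum([2, 4, 8], 2))  # p divides every q_i: order p**t
```

With $p = 2$ and $M = \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_8$ the quotient has $2^3$ elements, matching $M/pM \cong F^t$ with $t = 3$.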


Theorem 9.6.11 Let $R$ be a PID. Then two finitely generated $R$-modules $M_1$
and $M_2$ are

(a) isomorphic iff they have the same free rank and the same set of invariant factors;

(b) isomorphic iff they have the same free rank and the same set of elementary
divisors.

Proof (a) If $M_1$ and $M_2$ have the same free rank and the same set of invariant factors,
they are isomorphic. Conversely, let $M_1$ and $M_2$ be isomorphic. Let $\operatorname{rank} M_1 = n_1$
and $\operatorname{rank} M_2 = n_2$. Then $M_1 \cong M_2 \Rightarrow T(M_1) \cong T(M_2) \Rightarrow M_1/T(M_1) \cong R^{n_1}$ and
$M_2/T(M_2) \cong R^{n_2}$. Hence $R^{n_1} \cong R^{n_2}$. Let $p\ (\ne 0)$ be prime in $R$. Then
$R^{n_1}/pR^{n_1} \cong R^{n_2}/pR^{n_2} \Rightarrow F^{n_1} \cong F^{n_2}$ as vector spaces over the field
$F = R/pR$ (by Theorem 9.6.10(a)) $\Rightarrow n_1 = n_2$.

(b) If $M_1$ and $M_2$ have the same free rank and the same set of elementary divisors,
then they are isomorphic. We now show that isomorphic $M_1$ and $M_2$ have the same lists of
elementary divisors and invariant factors. To do so it is sufficient to consider the
isomorphic torsion modules $T(M_1)$ and $T(M_2)$. So we assume that both $M_1$ and
$M_2$ are torsion $R$-modules. By using (c) of Theorem 9.6.10, it is proved that the set
of elementary divisors of $M_1$ is the same as the set of elementary divisors of $M_2$.
Let $q_1, q_2, \ldots, q_t$ be a set of invariant factors of $M_1$ and $a_1, a_2, \ldots, a_k$ be a set of
invariant factors of $M_2$. We can find a set of elementary divisors of $M_1$ by taking the
prime power factors of $q_1, q_2, \ldots, q_t$. Then $q_1 \mid q_2 \mid \cdots \mid q_t \Rightarrow q_t$ is the product of the largest
of the prime powers among the elementary divisors, $q_{t-1}$ is the product of the largest prime
powers among these elementary divisors when the factors of $q_t$ have been removed,
and so on. Similarly for $a_1 \mid a_2 \mid \cdots \mid a_k$. Since the elementary divisors for $M_1$ and $M_2$
are the same, it follows that their invariant factors are also the same. Thus if $M_1 \cong M_2$,
then $M_1$ and $M_2$ have the same set of invariant factors and the same
set of elementary divisors. □

Corollary 1 Two finitely generated abelian groups are isomorphic iff they have the
same free rank (Betti number) and the same set of invariant factors.

Corollary 2 Two finitely generated abelian groups are isomorphic iff they have the
same free rank (Betti number) and the same set of elementary divisors.
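The passage between invariant factors and elementary divisors described in the proof of Theorem 9.6.11(b) is mechanical for finite abelian groups. The sketch below, with hypothetical function names and trial-division factoring that is only meant for small integers, converts in both directions.

```python
from collections import defaultdict


def prime_power_factors(n):
    """Prime-power factorisation of n > 1 as a dict {p: p**k}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 1) * p
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 1) * n
    return factors


def elementary_divisors(invariant_factors):
    """All prime powers occurring in the invariant factors, sorted."""
    return sorted(pk for q in invariant_factors
                  for pk in prime_power_factors(q).values())


def invariant_factors_from(elem_divisors):
    """Rebuild q_1 | q_2 | ... | q_t from a list of prime powers."""
    if not elem_divisors:
        return []
    by_prime = defaultdict(list)
    for pk in elem_divisors:
        (p,) = prime_power_factors(pk).keys()  # pk is a power of a single prime
        by_prime[p].append(pk)
    for powers in by_prime.values():
        powers.sort(reverse=True)
    length = max(len(powers) for powers in by_prime.values())
    result = []
    for i in range(length):
        q = 1
        for powers in by_prime.values():
            if i < len(powers):
                q *= powers[i]  # i-th largest power of each prime
        result.append(q)
    return result[::-1]  # smallest factor first, so that q_1 | q_2 | ...


print(invariant_factors_from([2, 4, 3]))
print(elementary_divisors([2, 12]))
```

The example reflects $\mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_3 \cong \mathbb{Z}_2 \oplus \mathbb{Z}_{12}$: the largest invariant factor collects the largest power of each prime, the next one collects what remains, and so on.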


9.7 Exact Sequences

Definition 9.7.1 A sequence of $R$-modules and their homomorphisms

$$\cdots \to M_{n-1} \xrightarrow{f_n} M_n \xrightarrow{f_{n+1}} M_{n+1} \to \cdots \qquad (9.2)$$

is said to be exact at $M_n$ iff $\operatorname{Im} f_n = \ker f_{n+1}$.
The sequence (9.2) is exact iff it is exact at each $M_n$. In particular, an exact
sequence of the form given in sequence (9.3),

$$0 \to M' \xrightarrow{f} M \xrightarrow{g} M'' \to 0 \qquad (9.3)$$

is called a short exact sequence.

Remark 1 Any long exact sequence (9.2) can be split into short exact sequences. 

Remark 2 If $f : M \to N$ is a module homomorphism, then $M/\ker f$ is called the
coimage of $f$ and is denoted by $\operatorname{Coim} f$, while $N/\operatorname{Im} f$ is called the cokernel
of $f$ and is denoted by $\operatorname{Coker} f$.

Note that $0 \to M \xrightarrow{f} N$ is an exact sequence of modules and their homomorphisms
$\Leftrightarrow$ $f$ is a module monomorphism, and $N \xrightarrow{g} P \to 0$ is an exact sequence
of modules and their homomorphisms $\Leftrightarrow$ $g$ is a module epimorphism.

If $M \xrightarrow{f} N \xrightarrow{g} P$ is an exact sequence of modules and their homomorphisms,
then $g \circ f = 0$.

Finally, if $M \xrightarrow{f} N \xrightarrow{g} P \to 0$ is an exact sequence of modules and their
homomorphisms, then $\operatorname{Coker} f = N/\operatorname{Im} f = N/\ker g = \operatorname{Coim} g \cong P$.

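Proposition 9.7.1 For any sequence of $R$-modules and their homomorphisms of the
form (9.2) (not necessarily exact), $f_{n+1} \circ f_n = 0$ iff $\operatorname{Im} f_n \subseteq \ker f_{n+1}$ $\forall$ index $n$.

Proof $\operatorname{Im} f_n \subseteq \ker f_{n+1} \Rightarrow (f_{n+1} \circ f_n)(x) = f_{n+1}(f_n(x)) = 0_{M_{n+1}}$ $\forall x \in M_{n-1}$
$\Rightarrow f_{n+1} \circ f_n = 0$.

Conversely, let $f_{n+1} \circ f_n = 0$. Now, for $y \in \operatorname{Im} f_n$, $\exists$ some $x \in M_{n-1}$ such that
$y = f_n(x)$. Hence $(f_{n+1} \circ f_n)(x) = f_{n+1}(f_n(x)) = f_{n+1}(y)$. Thus $f_{n+1} \circ f_n = 0 \Rightarrow
f_{n+1}(y) = 0_{M_{n+1}} \Rightarrow y \in \ker f_{n+1} \Rightarrow \operatorname{Im} f_n \subseteq \ker f_{n+1}$. □

Corollary If the sequence (9.2) is exact, then $f_{n+1} \circ f_n = 0$ for each index $n$.

Proof Exactness of (9.2) $\Rightarrow \operatorname{Im} f_n = \ker f_{n+1} \Rightarrow \operatorname{Im} f_n \subseteq \ker f_{n+1} \Rightarrow f_{n+1} \circ f_n = 0$
by Proposition 9.7.1 for each index $n$. □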

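Remark The converse of the corollary is not true in general. Consider the three term
sequence:

$$\mathbb{Z} \xrightarrow{f} \mathbb{Z} \xrightarrow{g} \mathbb{Z}_3,$$

where $\mathbb{Z}$ and $\mathbb{Z}_3$ are $\mathbb{Z}$-modules and $f$, $g$ are defined by $f(n) = 6n$, $g(n) = (n) \bmod 3$,
respectively.

Now for $x \in \operatorname{Im} f$, $\exists\, n \in \mathbb{Z}$ such that $x = f(n) = 6n$. Hence $g(x) = g(6n) =
(6n) \bmod 3 = (0) \bmod 3 \Rightarrow x \in \ker g \Rightarrow \operatorname{Im} f \subseteq \ker g \Rightarrow g \circ f = 0$. Now $3 \in \ker g$ but
$3 \notin \operatorname{Im} f \Rightarrow \ker g \ne \operatorname{Im} f \Rightarrow$ the sequence is not exact.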
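The counterexample can be replayed on a finite window of integers. The sketch below is only a numerical illustration, since a finite check does not replace the argument above; the window bounds are an arbitrary assumption.

```python
def f(n):
    """f : Z -> Z, n -> 6n."""
    return 6 * n


def g(n):
    """g : Z -> Z_3, n -> n mod 3."""
    return n % 3


window = range(-30, 31)

# g o f vanishes on every sampled integer, so Im f lies in ker g there.
composite_is_zero = all(g(f(n)) == 0 for n in window)

# 3 lies in ker g, but no sampled n satisfies 6n = 3, so 3 is not in Im f.
three_in_kernel = g(3) == 0
three_in_image = any(f(n) == 3 for n in window)

print(composite_is_zero, three_in_kernel, three_in_image)
```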
Theorem 9.7.1 If $f : M \to N$ is an $R$-homomorphism, $0 \to M$ denotes the inclusion
map and $N \to 0$ denotes the zero $R$-homomorphism, then

(i) $0 \to M \xrightarrow{f} N$ is exact $\Leftrightarrow$ $f$ is an $R$-monomorphism;

(ii) $M \xrightarrow{f} N \to 0$ is exact $\Leftrightarrow$ $f$ is an $R$-epimorphism;

(iii) $0 \to M \xrightarrow{f} N \to 0$ is exact $\Leftrightarrow$ $f$ is an $R$-isomorphism.

Proof (i) $0 \to M \xrightarrow{f} N$ is exact $\Leftrightarrow \ker f = \{0_M\} \Leftrightarrow f$ is an $R$-monomorphism.

(ii) $M \xrightarrow{f} N \to 0$ is exact $\Leftrightarrow \operatorname{Im} f = N \Leftrightarrow f$ is an $R$-epimorphism.

(iii) follows from (i) and (ii). □

Corollary If the sequence $0 \to M \xrightarrow{f} N \xrightarrow{g} P \to 0$ is exact, then $f$ is an
$R$-monomorphism, $g$ is an $R$-epimorphism and $g$ induces an $R$-isomorphism

$$\bar{g} : N/f(M) \to P.$$

Proof The proof follows from Theorem 9.7.1 and the Epimorphism Theorem for
modules. □

Example 9.7.1 (a) Let $f : A \to B$ be a homomorphism of abelian groups. Then
each of the following sequences is a short exact sequence.

(i) $0 \to \ker f \xrightarrow{i} A \xrightarrow{\pi} A/\ker f \to 0$, where $i : \ker f \hookrightarrow A$ is the inclusion
map and $\pi : A \to A/\ker f$ is the canonical epimorphism defined by $\pi(a) = a + \ker f$.

(ii) $0 \to \operatorname{Im} f \xrightarrow{i} B \xrightarrow{\pi} B/\operatorname{Im} f \to 0$, where $i : \operatorname{Im} f \hookrightarrow B$ is the inclusion
map and $\pi : B \to B/\operatorname{Im} f$ is the canonical epimorphism defined by $\pi(b) = b + \operatorname{Im} f$.

(b) Let $A$ be an $R$-module and $B$ be a submodule of $A$. Then the sequence
$0 \to B \xrightarrow{i} A \xrightarrow{\pi} A/B \to 0$ is exact, where $i : B \hookrightarrow A$ is the inclusion map and
$\pi : A \to A/B$ is the canonical $R$-epimorphism defined by $\pi(a) = a + B$.

(c) Let $A$ and $B$ be $R$-modules and $f : A \to B$ be an $R$-homomorphism.
Then the sequence $0 \to \ker f \xrightarrow{i} A \xrightarrow{f} B \xrightarrow{\pi} B/f(A) \to 0$ is exact, where
$i : \ker f \hookrightarrow A$ is the inclusion map and $\pi : B \to B/f(A)$ is the canonical
$R$-epimorphism defined by $\pi(b) = b + f(A)$.

(d) Given $R$-modules $A$ and $B$, there always exists at least one extension of $B$
by $A$.

Is the extension of $B$ by $A$ unique for arbitrary modules $A$ and $B$?

[Hint: Consider the short exact sequence

$$0 \to A \xrightarrow{f} A \oplus B \xrightarrow{g} B \to 0,$$

where $f(a) = (a, 0)$ and $g(a, b) = b$. Then $A \oplus B$ is an extension of $B$ by $A$.
For the second part, consider the short exact sequence obtained by taking $A = \mathbb{Z}$ and
$B = \mathbb{Z}_n$ in the first part, and the other one given by

$$0 \to \mathbb{Z} \xrightarrow{f_n} \mathbb{Z} \xrightarrow{g} \mathbb{Z}_n \to 0,$$

where $f_n : \mathbb{Z} \to \mathbb{Z}$, $x \mapsto nx$ and $g$ is the natural projection.]

Definition 9.7.2 An exact sequence of $R$-modules and their homomorphisms of the
form:

(a) $M \xrightarrow{f} N \to 0$ is said to split iff $\exists$ an $R$-homomorphism $g : N \to M$ such that
$f \circ g = 1_N$;

(b) $0 \to M \xrightarrow{f} N$ is said to split iff $\exists$ an $R$-homomorphism $h : N \to M$ such that
$h \circ f = 1_M$.

Such $R$-homomorphisms $g$ and $h$ are called splitting $R$-homomorphisms.

Definition 9.7.3 A short exact sequence of $R$-modules and their homomorphisms
$0 \to N \to M \to P \to 0$ is said to split

(a) on the right iff $M \to P \to 0$ splits;

(b) on the left iff $0 \to N \to M$ splits.

Theorem 9.7.2 Let $0 \to M \xrightarrow{f} N \xrightarrow{g} P \to 0$ be an exact sequence of $R$-modules
and their homomorphisms. Then the following statements are equivalent:

(a) the sequence splits on the right;

(b) the sequence splits on the left;

(c) $\operatorname{Im} f = \ker g$ is a direct summand of $N$.

Proof (a) $\Rightarrow$ (c). Let $\pi : P \to N$ be a right splitting $R$-homomorphism. Then
$g \circ \pi = 1_P$. We now consider $\ker g \cap \operatorname{Im} \pi = A$ (say). If $x \in A$, then $g(x) = 0_P$ and
$x = \pi(p)$ for some $p \in P$. Hence $0_P = g(x) = g(\pi(p)) = (g \circ \pi)(p) = 1_P(p) = p
\Rightarrow x = \pi(p) = \pi(0_P) = 0_N \Rightarrow A = \{0_N\}$. Again, for every $y \in N$,
$g(y - (\pi \circ g)(y)) = g(y) - (g \circ \pi \circ g)(y) = g(y) - (1_P \circ g)(y) = g(y) - g(y) = 0_P
\Rightarrow y - (\pi \circ g)(y) \in \ker g$, $\forall y \in N$.

Now, $y = (\pi \circ g)(y) + (y - (\pi \circ g)(y)) \in \operatorname{Im} \pi + \ker g$ $\forall y \in N \Rightarrow N \subseteq \operatorname{Im} \pi + \ker g$.
Again $\operatorname{Im} \pi + \ker g$ being a submodule of $N$, $\operatorname{Im} \pi + \ker g \subseteq N$. Consequently,
$N = \operatorname{Im} \pi \oplus \ker g \Rightarrow \operatorname{Im} f = \ker g$ is a direct summand of $N \Rightarrow$ (c).

(c) $\Rightarrow$ (a). Let $\operatorname{Im} f = \ker g$ be a direct summand of $N$. Then $N = \ker g \oplus A$ for
some submodule $A$ of $N$. Hence every $n \in N$ has a unique representation as
$n = y + a$ for some $y \in \ker g$ and some $a \in A$. Consider the mapping $\psi = g|_A : A \to P$.
Then $\psi$ is an $R$-isomorphism with inverse $\psi^{-1} : P \to A \Rightarrow g \circ \psi^{-1} = \psi \circ \psi^{-1} = 1_P \Rightarrow
\psi^{-1}$, regarded as a map $P \to N$, is a right splitting $R$-homomorphism $\Rightarrow$ (a).

Thus (a) $\Leftrightarrow$ (c).

Similarly (b) $\Leftrightarrow$ (c). □
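For cyclic groups the existence of a right splitting homomorphism can be decided by enumeration. The sketch below assumes the sequence $0 \to \mathbb{Z}_{n/p} \to \mathbb{Z}_n \xrightarrow{g} \mathbb{Z}_p \to 0$ with $g(y) = y \bmod p$, where $p$ divides $n$; a homomorphism $s : \mathbb{Z}_p \to \mathbb{Z}_n$ is determined by $k = s(1)$, which must satisfy $pk = 0$ in $\mathbb{Z}_n$. The function name is hypothetical.

```python
def right_splittings(n, p):
    """All k = s(1) giving a homomorphism s: Z_p -> Z_n with g(s(c)) = c, where g(y) = y mod p."""
    sections = []
    for k in range(n):
        if (p * k) % n != 0:  # s would not be well defined on Z_p
            continue
        if all(((c * k) % n) % p == c for c in range(p)):
            sections.append(k)
    return sections


# 0 -> Z_2 -> Z_4 -> Z_2 -> 0 has no right splitting, so Z_4 is not Z_2 (+) Z_2.
print(right_splittings(4, 2))

# 0 -> Z_2 -> Z_6 -> Z_3 -> 0 splits, in line with Z_6 = Z_2 (+) Z_3.
print(right_splittings(6, 3))
```

In the second case the section $s(1) = 4$ has image $\{0, 2, 4\}$, which is a complement of $\ker g = \{0, 3\}$ in $\mathbb{Z}_6$, as statement (c) of the theorem predicts.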
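Fig. 9.3 The Three Lemma diagram: rows $A \xrightarrow{f} B \xrightarrow{g} C$ and $M \xrightarrow{h} N \xrightarrow{k} P$, with vertical maps $\alpha : A \to M$, $\beta : B \to N$, $\gamma : C \to P$.

Fig. 9.4 The Four Lemma diagram: rows $A \xrightarrow{f} B \xrightarrow{g} C \xrightarrow{h} D$ and $A' \xrightarrow{f'} B' \xrightarrow{g'} C' \xrightarrow{h'} D'$, with vertical maps $\alpha : A \to A'$, $\beta : B \to B'$, $\gamma : C \to C'$, $\delta : D \to D'$.
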

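Theorem 9.7.3 (The Three Lemma) Let the diagram in Fig. 9.3 of $R$-modules and
their homomorphisms be commutative with two exact rows. Then

(i) $\alpha$, $\gamma$, and $h$ are $R$-monomorphisms $\Rightarrow$ $\beta$ is an $R$-monomorphism;

(ii) $\alpha$, $\gamma$, and $g$ are $R$-epimorphisms $\Rightarrow$ $\beta$ is an $R$-epimorphism;

(iii) $\alpha$, $\gamma$ are $R$-isomorphisms, $h$ is an $R$-monomorphism and $g$ is an
$R$-epimorphism $\Rightarrow$ $\beta$ is an $R$-isomorphism.

Proof Commutativity of the given diagram $\Rightarrow h \circ \alpha = \beta \circ f$ and $k \circ \beta = \gamma \circ g$.
Again the exactness of the given rows $\Rightarrow \operatorname{Im} f = \ker g$ and $\operatorname{Im} h = \ker k$.

(i) $b \in \ker \beta \Rightarrow \beta(b) = 0_N \Rightarrow k(\beta(b)) = k(0_N) \Rightarrow (k \circ \beta)(b) = 0_P \Rightarrow
(\gamma \circ g)(b) = 0_P \Rightarrow \gamma(g(b)) = \gamma(0_C) \Rightarrow g(b) = 0_C$, since $\gamma$ is an $R$-monomorphism $\Rightarrow
b \in \ker g = \operatorname{Im} f \Rightarrow b = f(a)$ for some $a \in A \Rightarrow \beta(b) = \beta(f(a)) \Rightarrow 0_N =
(\beta \circ f)(a) = (h \circ \alpha)(a) = h(\alpha(a)) \Rightarrow h(0_M) = h(\alpha(a)) \Rightarrow 0_M = \alpha(a)$, since $h$
is an $R$-monomorphism $\Rightarrow \alpha(0_A) = \alpha(a) \Rightarrow 0_A = a$, since $\alpha$ is an $R$-monomorphism
$\Rightarrow f(0_A) = f(a) \Rightarrow 0_B = b \Rightarrow \ker \beta = \{0_B\} \Rightarrow \beta$ is an $R$-monomorphism.

(ii) $n \in N \Rightarrow k(n) \in P \Rightarrow k(n) = \gamma(c)$ for some $c \in C$, since $\gamma$ is an
$R$-epimorphism $\Rightarrow k(n) = \gamma(g(b))$ for some $b \in B$, since $g$ is an $R$-epimorphism $\Rightarrow
k(n) = (\gamma \circ g)(b) = (k \circ \beta)(b) = k(\beta(b)) \Rightarrow k(n - \beta(b)) = 0_P \Rightarrow n - \beta(b) \in
\ker k \Rightarrow n - \beta(b) \in \operatorname{Im} h \Rightarrow n - \beta(b) = h(m)$ for some $m \in M \Rightarrow n - \beta(b) =
h(\alpha(a))$ for some $a \in A$, since $\alpha$ is an $R$-epimorphism $\Rightarrow n - \beta(b) = (h \circ \alpha)(a) =
(\beta \circ f)(a) = \beta(f(a)) \Rightarrow n = \beta(b) + \beta(f(a)) = \beta(b + f(a))$. Thus for each $n \in N$,
$\exists$ an element $b + f(a) \in B$ such that $n = \beta(b + f(a)) \Rightarrow \beta$ is an $R$-epimorphism.

(iii) follows from (i) and (ii). □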

Theorem 9.7.4 (The Four Lemma) Let the diagram in Fig. 9.4 of $R$-modules and
$R$-homomorphisms be commutative with two rows of exact sequences.

Then

(i) $\alpha$, $\gamma$ are $R$-epimorphisms and $\delta$ is an $R$-monomorphism $\Rightarrow$ $\beta$ is an
$R$-epimorphism;

(ii) $\alpha$ is an $R$-epimorphism and $\beta$, $\delta$ are $R$-monomorphisms $\Rightarrow$ $\gamma$ is an
$R$-monomorphism.

Fig. 9.5 The Five Lemma diagram: rows $A \to B \to C \to D \to E$ and $A' \to B' \to C' \to D' \to E'$, with vertical maps $\alpha$, $\beta$, $\gamma$, $\delta$, $\lambda$.

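Proof Commutativity of the given diagram $\Rightarrow \beta \circ f = f' \circ \alpha$; $\gamma \circ g = g' \circ \beta$ and
$\delta \circ h = h' \circ \gamma$. Again exactness of the given rows $\Rightarrow \operatorname{Im} f = \ker g$, $\operatorname{Im} g = \ker h$,
$\operatorname{Im} f' = \ker g'$, and $\operatorname{Im} g' = \ker h'$.

(i) $b' \in B' \Rightarrow g'(b') \in C' \Rightarrow g'(b') = \gamma(c)$ for some $c \in C$, since $\gamma$ is an
$R$-epimorphism $\Rightarrow h'(g'(b')) = h'(\gamma(c)) \Rightarrow (h' \circ g')(b') = (h' \circ \gamma)(c) \Rightarrow 0_{D'} =
(\delta \circ h)(c) = \delta(h(c))$, since $h' \circ g' = 0$ by the corollary to Proposition 9.7.1 $\Rightarrow
\delta(0_D) = \delta(h(c)) \Rightarrow 0_D = h(c)$, since $\delta$ is an $R$-monomorphism $\Rightarrow c \in \ker h = \operatorname{Im} g \Rightarrow
c = g(b)$ for some $b \in B \Rightarrow \gamma(c) = \gamma(g(b)) = (\gamma \circ g)(b) \Rightarrow g'(b') = (g' \circ \beta)(b) \Rightarrow
g'(b') = g'(\beta(b)) \Rightarrow g'(b' - \beta(b)) = 0_{C'} \Rightarrow b' - \beta(b) \in \ker g' = \operatorname{Im} f' \Rightarrow
b' - \beta(b) = f'(a')$ for some $a' \in A' \Rightarrow b' - \beta(b) = f'(\alpha(a))$ for some $a \in A$, since
$\alpha$ is an $R$-epimorphism $\Rightarrow b' - \beta(b) = (f' \circ \alpha)(a) = (\beta \circ f)(a) = \beta(f(a)) \Rightarrow
b' = \beta(b) + \beta(f(a)) = \beta(b + f(a))$. Thus given $b' \in B'$, $\exists$ an element $b + f(a) \in B$
such that $\beta(b + f(a)) = b' \Rightarrow \beta$ is an $R$-epimorphism.

(ii) $c \in \ker \gamma \Rightarrow \gamma(c) = 0_{C'} \Rightarrow h'(\gamma(c)) = (h' \circ \gamma)(c) = 0_{D'} \Rightarrow (h' \circ \gamma)(c) =
\delta(0_D) \Rightarrow (\delta \circ h)(c) = \delta(0_D) \Rightarrow \delta(h(c)) = \delta(0_D) \Rightarrow h(c) = 0_D$, since $\delta$ is an
$R$-monomorphism $\Rightarrow c \in \ker h = \operatorname{Im} g \Rightarrow c = g(b)$ for some $b \in B \Rightarrow \gamma(c) =
\gamma(g(b)) \Rightarrow \gamma(c) = (\gamma \circ g)(b) \Rightarrow \gamma(c) = (g' \circ \beta)(b) \Rightarrow 0_{C'} = g'(\beta(b)) \Rightarrow \beta(b) \in
\ker g' = \operatorname{Im} f' \Rightarrow \beta(b) = f'(a')$ for some $a' \in A' \Rightarrow \beta(b) = f'(\alpha(a))$ for some
$a \in A$, since $\alpha$ is an $R$-epimorphism $\Rightarrow \beta(b) = (f' \circ \alpha)(a) \Rightarrow \beta(b) = (\beta \circ f)(a) =
\beta(f(a)) \Rightarrow b = f(a)$, since $\beta$ is an $R$-monomorphism $\Rightarrow g(b) = g(f(a)) =
(g \circ f)(a) = 0_C$, since $g \circ f = 0 \Rightarrow c = g(b) = 0_C \Rightarrow \ker \gamma = \{0_C\} \Rightarrow \gamma$ is an
$R$-monomorphism. □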

A more frequently used consequence is 

Theorem 9.7.5 (The Five Lemma) Let the diagram in Fig. 9.5 of $R$-modules and
their $R$-homomorphisms be commutative with two exact rows. If $\alpha$, $\beta$, $\delta$, $\lambda$ are
$R$-isomorphisms, then $\gamma$ is also an $R$-isomorphism.

Proof Apply the Four Lemma to the right hand squares of the given diagram to show
that $\gamma$ is an $R$-epimorphism. Again apply the same lemma to the left hand squares
of the given diagram to show that $\gamma$ is an $R$-monomorphism. Hence it follows that
under the given conditions, $\gamma$ is an $R$-isomorphism. □

Remark 1 The Short Five Lemma follows from Theorem 9.7.5 as a particular case. 
However, an independent proof is given in Lemma 9.7.1. 

Remark 2 The proof of Theorem 9.7.5 (The Five Lemma) uses only weaker hypotheses
and shows in more detail that

(a) $\alpha$ is an $R$-epimorphism and $\beta$, $\delta$ are $R$-monomorphisms $\Rightarrow$ $\gamma$ is an
$R$-monomorphism.

(b) $\lambda$ is an $R$-monomorphism and $\beta$, $\delta$ are $R$-epimorphisms $\Rightarrow$ $\gamma$ is an
$R$-epimorphism.

Fig. 9.6 Commutativity of diagram of two short exact sequences: rows $0 \to M \to N \to P \to 0$ and $0 \to M_1 \to N_1 \to P_1 \to 0$, with vertical maps $f$, $g$, $h$.

Fig. 9.7 Commutativity of diagram: rows $0 \to A \xrightarrow{f} B \xrightarrow{g} C \to 0$ and $0 \to A' \xrightarrow{f'} B' \xrightarrow{g'} C' \to 0$, with vertical maps $\alpha$, $\beta$, $\gamma$.

Two short exact sequences $0 \to M \to N \to P \to 0$ and $0 \to M_1 \to N_1 \to P_1 \to 0$
of $R$-modules and their homomorphisms are said to be isomorphic iff $\exists$
isomorphisms $f : M \to M_1$, $g : N \to N_1$ and $h : P \to P_1$ such that the diagram in
Fig. 9.6 is commutative.

Note that 

(i) isomorphism of short exact sequences of $R$-modules and their homomorphisms
is an equivalence relation;

(ii) short exact sequences of $R$-modules and their homomorphisms form a category
(see Example B.1.1 of Appendix B).

Lemma 9.7.1 (The Short Five Lemma) Let $R$ be a ring and let Fig. 9.7 show a
commutative diagram with rows of short exact sequences of $R$-modules and their
homomorphisms. If

(i) $\alpha$ and $\gamma$ are monomorphisms, then so is $\beta$;

(ii) $\alpha$ and $\gamma$ are epimorphisms, then so is $\beta$;

(iii) $\alpha$ and $\gamma$ are isomorphisms, then so is $\beta$.

Proof (i) Suppose $\beta(b) = 0$ for some $b \in B$. We shall show that $b = 0$. Now
$\gamma g(b) = g'\beta(b) = 0 \Rightarrow g(b) = 0$, since $\gamma$ is a monomorphism $\Rightarrow b \in \ker g =
\operatorname{Im} f \Rightarrow b = f(a)$ for some $a \in A \Rightarrow f'\alpha(a) = \beta f(a) = \beta(b) = 0 \Rightarrow \alpha(a) = 0$
(since $\ker f' = 0 \Rightarrow f'$ is a monomorphism) $\Rightarrow a = 0$ (since $\alpha$ is a monomorphism)
$\Rightarrow \beta$ is a monomorphism (since $b = f(a) = f(0) = 0$).

(ii) Take $b' \in B'$. Then $g'(b') \in C' \Rightarrow g'(b') = \gamma(c)$ for some $c \in C$ (since $\gamma$ is
an epimorphism) $\Rightarrow c = g(b)$ for some $b \in B$ (since $\operatorname{Im} g = C$ by exactness of the top
row at $C$) $\Rightarrow g'\beta(b) = \gamma g(b) = \gamma(c) = g'(b') \Rightarrow g'(\beta(b) - b') = 0 \Rightarrow \beta(b) - b' \in
\ker g' = \operatorname{Im} f' \Rightarrow \beta(b) - b' = f'(a')$ for some $a' \in A'$. Clearly, $\alpha$ is an
epimorphism $\Rightarrow a' = \alpha(a)$ for some $a \in A$.

Now $b - f(a) \in B$. Then $\beta(b - f(a)) = \beta(b) - \beta f(a)$.
By commutativity of the diagram, $\beta f(a) = f'\alpha(a) = f'(a') = \beta(b) - b' \Rightarrow
\beta(b - f(a)) = b' \Rightarrow \beta$ is an epimorphism.

(iii) follows from (i) and (ii). □


9.8 Modules with Chain Conditions 

Let $M$ be an $R$-module and $S(M)$ the set of all submodules of $M$. Then analogous
to rings, $S(M)$, ordered by the inclusion relation $\subseteq$, is said to satisfy

(i) the ascending chain condition iff every increasing chain of submodules
$M_1 \subseteq M_2 \subseteq \cdots$ in $S(M)$ is stationary (i.e., there exists an integer $n$ such that
$M_n = M_{n+1} = \cdots$);

(ii) the maximal condition iff every non-empty subset of $S(M)$ has a maximal
element.

Remark The conditions (i) and (ii) are equivalent.

If $S(M)$ is ordered by $\supseteq$, then (i) becomes the descending chain condition and
(ii) becomes the minimal condition and they become equivalent.

Definition 9.8.1 A module M satisfying either of the equivalent conditions (i) or 
(ii) for submodules is said to be Noetherian (named after Emmy Noether). 

Definition 9.8.2 A module M satisfying either the descending chain condition or 
the minimal condition for submodules is said to be Artinian (named after Emil 
Artin). 

Theorem 9.8.1 The following three statements for an R -module M are equivalent : 

(i) M satisfies ascending chain condition for submodules. 

(ii) The maximal condition (for submodules) holds in M.

(iii) Every submodule of M is finitely generated. 

Proof Proceed as in Theorem 7.1.1. □ 

A Noetherian module is now redefined. 

Definition 9.8.3 An R -module M satisfying any one of the three equivalent condi- 
tions of Theorem 9.8.1 is said to be Noetherian. 

Remark If $R$ is a PID, then every non-empty set of ideals of $R$ has a maximal element
and hence $R$ is Noetherian.


Theorem 9.8.2 A homomorphic image of a Noetherian module is also Noetherian. 
Proof Proceed as in Theorem 7.1.4. □

Corollary If $A$ is a submodule of a Noetherian module $M$, then the quotient module
$M/A$ and $A$ are also Noetherian.

Proof $A$ is Noetherian by Theorem 9.8.1. Again $M/A$, being the homomorphic
image of $M$ under the natural $R$-epimorphism $\pi : M \to M/A$, is also Noetherian by
Theorem 9.8.2. □

Theorem 9.8.3 Let M be a module and N a submodule of M. If N and M/N are 
both Noetherian , then M is also Noetherian. 

Proof Proceed as in Theorem 7.1.5. □ 

Corollary 1 In an exact sequence $0 \to M_1 \to M \to M_2 \to 0$ of $R$-modules and
their homomorphisms, $M$ is Noetherian iff $M_1$ and $M_2$ are both Noetherian.

Proof It follows from Theorems 9.8.2 and 9.8.3. □ 

Corollary 2 Let $M$ be an $R$-module and $A$, $B$ be submodules of $M$. If $M = A \oplus B$
and if both $A$, $B$ are Noetherian, then $M$ is Noetherian. A finite direct sum of
Noetherian modules is Noetherian.

Proof $A \times B$ contains $A$ as a submodule whose factor module is isomorphic to $B$
(see Worked-Out Exercise 1). Hence by Theorem 9.8.3, $A \times B$ is a Noetherian
module.

Consider the mapping $f : A \times B \to M$, defined by $f(a, b) = a + b$.

Since $M = A \oplus B$, it follows that $f$ is well defined and is an $R$-epimorphism.
Then $M$ being the homomorphic image of the Noetherian module $A \times B$, $M$ is
Noetherian by Theorem 9.8.2.

The last part follows by induction. □

Proposition 9.8.1 Let $R$ be a Noetherian ring and let $M$ be a finitely generated
$R$-module. Then $M$ is Noetherian.

Proof Let $M = \langle x_1, x_2, \ldots, x_n \rangle$. There exists a homomorphism
$f : R \times R \times \cdots \times R = R^n \to M$, defined by $f(r_1, r_2, \ldots, r_n) = r_1x_1 + r_2x_2 + \cdots + r_nx_n$.

Then $f$ is an epimorphism. The product $R^n$ being Noetherian, $M$ is Noetherian
by Theorem 9.8.2. □

Theorem 9.8.4 Let M be an R-module. Then the following two statements are 
equivalent : 

(i) M satisfies descending chain condition for submodules. 

(ii) Minimal condition (for submodules) holds in M. 
Proof Proceed as in Theorem 7.1.3. □

An Artinian module is now redefined. 

Definition 9.8.4 An R -module M satisfying any one of the equivalent conditions 
of Theorem 9.8.4 is called Artinian. 

Remark Analogues of Theorems 9.8.2 and 9.8.3 hold for Artinian modules. 

Theorem 9.8.5 (Fitting’s Lemma) For any endomorphism $f$ of an Artinian and
Noetherian module $M$, there exists an integer $n$ such that $M = \operatorname{Im} f^n \oplus \ker f^n$.

This lemma was proved by H. Fitting in 1933. 

Proof Left as an exercise. □ 
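A small instance shows why a power of $f$ is needed. The sketch below takes the finite (hence Artinian and Noetherian) $\mathbb{Z}$-module $M = \mathbb{Z}_{12}$ and the endomorphism $f(x) = 2x$ as an assumed example, and compares $\operatorname{Im} f^n$ and $\ker f^n$ for $n = 1, 2$; the function names are hypothetical.

```python
def fitting_data(modulus, multiplier, n):
    """Image and kernel of f^n, where f(x) = multiplier * x on Z_modulus."""
    power = pow(multiplier, n, modulus)
    image = {(power * x) % modulus for x in range(modulus)}
    kernel = {x for x in range(modulus) if (power * x) % modulus == 0}
    return image, kernel


def is_direct_sum(modulus, image, kernel):
    """True iff Z_modulus is the internal direct sum of `image` and `kernel`."""
    sums = {(u + v) % modulus for u in image for v in kernel}
    return image & kernel == {0} and len(sums) == modulus


for n in (1, 2):
    image, kernel = fitting_data(12, 2, n)
    print(n, sorted(image), sorted(kernel), is_direct_sum(12, image, kernel))
```

For $n = 1$ the image and kernel share the element 6, so the sum is not direct. For $n = 2$ one gets $\operatorname{Im} f^2 = \{0, 4, 8\}$ and $\ker f^2 = \{0, 3, 6, 9\}$, and $\mathbb{Z}_{12}$ is their direct sum.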


9.9 Representations 

A basic tool of ring theory is the representation of a ring $R$ in the ring $\operatorname{End}(M)$ of
endomorphisms of an abelian group $M$. Representations of $R$ and $R$-modules are
closely related.

If $M$ is an abelian group and $R = \operatorname{End}(M)$, then $M$ can be made into a left
$R$-module by defining an action $R \times M \to M$, $(f, x) \mapsto f(x)$ $\forall f \in R$, $x \in M$. On the
other hand, if $M$ is a left $R$-module, then for each $r \in R$, the mapping $l_r : M \to M$,
$x \mapsto rx$ is an endomorphism of $M$. Then the mapping $\rho : R \to \operatorname{End}(M)$, $r \mapsto l_r$ is
a ring homomorphism (called a representation of $R$).

Definition 9.9.1 A homomorphism of a ring $R$ into the ring $\operatorname{End}(M)$ of all
endomorphisms of an abelian group $M$ is called a representation of $R$.

In particular, if $M$ is a left $R$-module, the mapping $\rho : R \to \operatorname{End}(M)$, $r \mapsto l_r$ is
the representation of $R$ associated with $M$.

Definition 9.9.2 A left R -module M is said to be faithful iff the representation of 
R associated with M is injective. 
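Faithfulness can be tested directly in a finite case. The sketch below assumes $R = \mathbb{Z}_n$ acting on $M = \mathbb{Z}_m$ by multiplication, with $m$ dividing $n$ so that the action is well defined, and records the endomorphism $x \mapsto rx$ attached to each $r$ as a tuple of its values; the function names are hypothetical.

```python
def representation(n, m):
    """Map each r in Z_n to the endomorphism x -> r*x of Z_m, stored as a tuple of values."""
    assert n % m == 0  # needed for Z_m to be a Z_n-module
    return {r: tuple((r * x) % m for x in range(m)) for r in range(n)}


def is_faithful(n, m):
    """True iff distinct ring elements give distinct endomorphisms."""
    rho = representation(n, m)
    return len(set(rho.values())) == n


print(is_faithful(6, 3))  # 3 acts as zero on Z_3, so the representation is not injective
print(is_faithful(6, 6))  # Z_6 acts faithfully on itself
```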

Theorem 9.9.1 Let $R$ be a ring and $I$ a minimal left ideal of $R$. If $M$ is a faithful
simple left $R$-module, then $M$ is isomorphic to the left $R$-module $I$.

Proof Let $a_0$ be a non-zero element of $I$. Since $M$ is faithful, $a_0x \ne 0$ for some
$x \in M$. Hence the map $\rho_x : I \to M$ defined by $\rho_x(a) = ax$ ($a \in I$) is a non-zero
homomorphism. Hence by Schur’s Lemma, $\rho_x$ is a monomorphism and also an
epimorphism (since the $R$-module $M$ is simple). Consequently, $\rho_x$ is an
isomorphism. □
Fig. 9.8 Commutativity of the rectangular diagram (top row $f : M \to N$; vertical
maps $\pi_A : M \to M/A$ and $\pi_B : N \to N/B$; bottom row $f_* : M/A \to N/B$)

9.10 Worked-Out Exercises 

1. Let M be an R-module and A, B be submodules of M such that $M = A \oplus B$.
Then

(a) the R-modules M/A and B are R-isomorphic;

(b) the R-modules M/B and A are R-isomorphic.

Solution (a) Let $x \in M = A \oplus B$. Then x has the unique representation as
$x = a + b$, $a \in A$ and $b \in B$. Consider the mapping

$f : M \to B$, defined by $f(x) = b$.

Then f is an R-epimorphism such that $\ker f = A$. Hence by the Epimorphism
Theorem for modules, $M/\ker f \cong B \Rightarrow M/A \cong B$.

(b) It follows similarly.

2. Let M, N be R-modules and $f : M \to N$ be an R-homomorphism. Let A be
a submodule of M and B be a submodule of N such that $f(A) \subseteq B$. Then
there exists a unique R-homomorphism $f_* : M/A \to N/B$ making the diagram
in Fig. 9.8 commutative, i.e., $\pi_B \circ f = f_* \circ \pi_A$, where $\pi_A$ and $\pi_B$ are the
respective natural epimorphisms.

Solution Define $f_* : M/A \to N/B$ by $f_*(m + A) = f(m) + B$.

Then $f_*$ is well defined (if $m + A = m' + A$, then $m - m' \in A$, so
$f(m) - f(m') \in f(A) \subseteq B$) and is an R-homomorphism making the given diagram
commutative.

Uniqueness of $f_*$: Let $g : M/A \to N/B$ be an R-homomorphism making
the given diagram commutative. Then $g \circ \pi_A = \pi_B \circ f$. Now $g(m + A) =
g(\pi_A(m)) = (g \circ \pi_A)(m) = (\pi_B \circ f)(m) = \pi_B(f(m)) = f(m) + B = f_*(m + A)$
$\forall m + A \in M/A \Rightarrow g = f_*$, which proves the uniqueness of $f_*$.

3. Consider $f_*$ defined in Worked-Out Exercise 2. Then

(a) $f_*$ is an R-monomorphism $\Leftrightarrow f^{-1}(B) = A$;

(b) $f_*$ is an R-epimorphism $\Leftrightarrow \operatorname{Im} f + B = N$.

Solution (a)

$\ker f_* = \{x + A \in M/A : f_*(x + A) = 0_N + B\}$
$= \{x + A \in M/A : f(x) \in B\}$
$= \{x + A \in M/A : x \in f^{-1}(B)\}$
$= f^{-1}(B)/A.$

Hence $f_*$ is an R-monomorphism $\Leftrightarrow f^{-1}(B) = A$.

(b) Let $f_*$ be an R-epimorphism. Then for every $n + B \in N/B$, $\exists\, m + A \in
M/A$ such that $f_*(m + A) = n + B$. Hence $f(m) + B = n + B \Rightarrow f(m) - n \in B
\Rightarrow f(m) - n = b$ for some $b \in B \Rightarrow n = f(m) - b \Rightarrow n \in \operatorname{Im} f + B \Rightarrow
N \subseteq \operatorname{Im} f + B \subseteq N \Rightarrow \operatorname{Im} f + B = N$. For the converse, take
$n + B \in N/B$. Then by hypothesis $\exists\, m \in M$ and $b \in B$ such that $n = f(m) + b
\Rightarrow f(m) = n - b \Rightarrow f_*(m + A) = f(m) + B = n - b + B = n + B \Rightarrow f_*$ is
surjective $\Rightarrow f_*$ is an R-epimorphism.
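The criteria of Worked-Out Exercises 2 and 3 can be tried on a concrete case. The sketch below assumes M = N = Z, $f(x) = kx$, A = aZ and B = bZ, so that $M/A = Z_a$ and $N/B = Z_b$; this instance and the function names are illustrative choices, not taken from the text.

```python
# Hypothetical instance: M = N = Z, f(x) = k*x, A = aZ, B = bZ.
# Then M/A = Z_a, N/B = Z_b and f_*(x + aZ) = k*x + bZ.

def induced_map(k, a, b):
    """Return f_* : Z_a -> Z_b as a dict, or None when f(A) is not inside B."""
    if (k * a) % b != 0:          # f(A) ⊆ B  <=>  b divides k*a
        return None
    return {x: (k * x) % b for x in range(a)}

def is_injective(fs):
    return len(set(fs.values())) == len(fs)

def is_surjective(fs, b):
    return set(fs.values()) == set(range(b))

fs = induced_map(2, 3, 6)         # Z_3 -> Z_6, x -> 2x
print(fs, is_injective(fs), is_surjective(fs, 6))
```

Here $f^{-1}(6Z) = 3Z = A$, so $f_*$ is injective, while $\operatorname{Im} f + B = 2Z \neq Z$, so it is not surjective, in agreement with parts (a) and (b).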

4. Let M, N be R-modules and A, B be submodules of M and N, respectively,
and $f : M \to N$ be an R-homomorphism. Then the following statements are
equivalent:

(a) $f(A) \subseteq B$;

(b) There exists a unique R-homomorphism $f_* : M/A \to N/B$ making the
diagram of W.O. Ex. 2 commutative.

Solution Proceed as in Worked-Out Exercise 2. 

5. Let M be a unitary R-module. Then M is simple $\Rightarrow$ M and R/K are isomorphic
for some left ideal K of R.

Solution M is simple $\Rightarrow M = Rx$ for some $x\ (\neq 0_M) \in M$ by Theorem 9.2.4.

Consider the mapping $f : R \to M$ defined by $f(r) = rx$, $\forall r \in R$. Then f is
an R-epimorphism such that $K = \ker f = \{r \in R : f(r) = 0_M\}$ is a left ideal
of R. Hence by the Epimorphism Theorem, M and R/K are isomorphic.

6. Let M be a unitary R-module. If M is simple, then M and R/K are isomorphic
for some maximal left ideal K of R.

Solution By Worked-Out Exercise 5, M and R/K are isomorphic. Since M
is simple, R/K is also a simple unitary R-module. Hence by Corollary 1 to
Theorem 9.2.4, R/K is a division ring $\Rightarrow$ K is a maximal left ideal of R.

7. Let R be a commutative ring and M, N be R-modules. Then

(a) $\operatorname{Hom}_R(M, N)$ is a left module over the ring End(N, +);

(b) $\operatorname{Hom}_R(M, N)$ is a right module over the ring End(M, +).

Solution (a) Define a mapping $\mu : \operatorname{End}(N, +) \times \operatorname{Hom}_R(M, N) \to \operatorname{Hom}_R(M, N)$
by $\mu(f, g) = f \circ g$. Then $\mu$ is a scalar multiplication.

(b) Define a mapping $\rho : \operatorname{Hom}_R(M, N) \times \operatorname{End}(M, +) \to \operatorname{Hom}_R(M, N)$ by
$\rho(g, f) = g \circ f$. Then $\rho$ is a scalar multiplication.

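8. The Z-module $\operatorname{Hom}_Z(Q, Z) = \{0\}$.

Solution Let $f \in \operatorname{Hom}_Z(Q, Z)$. Then $f : Q \to Z$ is a Z-homomorphism. Now
$1 \in Q \Rightarrow f(1) \in Z$. For any integer $q\ (\neq 0)$, $1/q \in Q \Rightarrow f(1) = f(q \cdot \tfrac{1}{q}) =
q f(\tfrac{1}{q}) \Rightarrow f(\tfrac{1}{q}) = \tfrac{1}{q} f(1) \Rightarrow f(1)$ is a multiple of q $\Rightarrow q \mid f(1)\ \forall q\ (\neq 0) \in Z
\Rightarrow f(1) = 0$. For any rational number $p/q \in Q$, $f(p/q) = p f(1/q) = p \cdot
\tfrac{1}{q} f(1) = 0 \Rightarrow f$ is a zero Z-homomorphism $\Rightarrow \operatorname{Hom}_Z(Q, Z) = \{0\}$.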

9. (a) $\operatorname{Hom}_Z(Z_2, Z) = \{0\}$.

(b) $\operatorname{Hom}_Z(Z_2, Q) = \{0\}$.

Solution (a) $Z_2 = \{(0), (1)\}$. Then for $f \in \operatorname{Hom}_Z(Z_2, Z)$, $f : Z_2 \to Z$ is a
Z-homomorphism. Now $f((1)) \in Z \Rightarrow f((1)) = n$ for some $n \in Z \Rightarrow 0 = f((0)) =
f((1) + (1)) = f((1)) + f((1)) = n + n = 2n \Rightarrow n = 0 \Rightarrow f((1)) = 0$. Thus
$f = 0$. Hence $\operatorname{Hom}_Z(Z_2, Z) = \{0\}$.

(b) Proceed as in (a).

10. Let R be a commutative ring.

(a) If $O \to A \xrightarrow{\alpha} B \xrightarrow{\beta} C$ is an exact sequence of R-modules and
R-homomorphisms, then for any R-module M, the sequence
$O \to \operatorname{Hom}_R(M, A) \xrightarrow{\alpha_*} \operatorname{Hom}_R(M, B) \xrightarrow{\beta_*} \operatorname{Hom}_R(M, C)$ is exact.

(b) If $A \xrightarrow{\alpha} B \xrightarrow{\beta} C \to O$ is an exact sequence of R-modules and
R-homomorphisms, then for any R-module M, the sequence
$O \to \operatorname{Hom}_R(C, M) \xrightarrow{\beta^*} \operatorname{Hom}_R(B, M) \xrightarrow{\alpha^*} \operatorname{Hom}_R(A, M)$ is exact
($\alpha_*$ and $\alpha^*$ are defined in Proposition 9.5.2).

(c) Show by examples that the exactness of the sequence of R-modules and
their homomorphisms $O \to B \xrightarrow{f} C \xrightarrow{g} D \to O$ does not in general imply
the exactness of either of the following sequences:

(i) For a given R-module M, the sequence
$O \to \operatorname{Hom}_R(M, B) \xrightarrow{f_*} \operatorname{Hom}_R(M, C) \xrightarrow{g_*} \operatorname{Hom}_R(M, D) \to O$.

(ii) For a given R-module M, the sequence
$O \to \operatorname{Hom}_R(D, M) \xrightarrow{g^*} \operatorname{Hom}_R(C, M) \xrightarrow{f^*} \operatorname{Hom}_R(B, M) \to O$.

Solution (a) Use the definitions of $\alpha_*$ and $\beta_*$ to show that under the given
conditions $\alpha_*$ is an R-monomorphism and $\operatorname{Im} \alpha_* = \ker \beta_*$.

(b) It is sufficient to prove that $\beta^*$ is an R-monomorphism and $\operatorname{Im} \beta^* =
\ker \alpha^*$.

(c) (i) Consider the exact sequence

$O \to Z \xrightarrow{i} Q \xrightarrow{q} Q/Z \to O$,

where $i : Z \to Q$ is the inclusion map and q is the canonical Z-homomorphism
defined by $q(x) = x + Z$, and take $M = Z_2$. Then the sequence
$O \to \operatorname{Hom}_Z(Z_2, Z) \xrightarrow{i_*} \operatorname{Hom}_Z(Z_2, Q) \xrightarrow{q_*} \operatorname{Hom}_Z(Z_2, Q/Z) \to O$ is not exact.
This is because $q_*$ is not surjective, since $\operatorname{Hom}_Z(Z_2, Q) = O$ but
$\operatorname{Hom}_Z(Z_2, Q/Z) \neq O$ (for instance, $(1) \mapsto \tfrac{1}{2} + Z$ is a non-zero homomorphism).

(ii) Consider the exact sequence in (i) and take $M = Z$. Then the sequence
$O \to \operatorname{Hom}_Z(Q/Z, Z) \xrightarrow{q^*} \operatorname{Hom}_Z(Q, Z) \xrightarrow{i^*} \operatorname{Hom}_Z(Z, Z) \to O$ is not exact.
This is because $\operatorname{Hom}_Z(Q, Z) = O$ and $\operatorname{Hom}_Z(Z, Z) \cong Z$, and hence $i^*$ is not
surjective.

11. A finite abelian group $G \neq \{0\}$ is not a free Z-module.

Solution By Lagrange's Theorem, if n (>0) is the number of elements of G,
and if $x \in G$, then $nx = 0$, so that $\{x\}$ is not linearly independent over Z for any
$x \in G$. Hence no non-empty subset of G is linearly independent. Thus G has
no basis over Z $\Rightarrow$ G is not a free Z-module.

12. The module Q over Z is not free, i.e., Q is not a free Z-module.

Solution Let $p/q\ (\neq 0) \in Q$. Then $n \cdot p/q = 0 \Rightarrow n = 0 \Rightarrow \{p/q\}$ is linearly
independent over Z. Again, if p/q and a/b are two different non-zero rational
numbers, then $(aq)(p/q) - (pb)(a/b) = 0$, where $aq, pb \in Z$ are non-zero. Hence
p/q and a/b are linearly dependent over Z, so a basis of Q, if it exists, is a
singleton. We now show that no singleton set can generate Q. To show this, let
$\{p/q\}$, where $p/q \neq 0$, generate Q. As $p/2q \in Q$, $\exists\, n \in Z$ such that
$n \cdot p/q = p/2q$. Then $n = 1/2 \notin Z$, a contradiction. In this way
we find that Q admits no basis over Z. In other words, Q is not a free Z-module.

13. Let M be an R-module.

(a) If $a, b \in R$, then $a - b \in \operatorname{Ann}(M)$ in R $\Rightarrow ax = bx\ \forall x \in M$.

(b) If M is endowed with the R/Ann(M)-module structure, then find the
annihilator of M in R/Ann(M).

Solution (a) Let $a, b \in R$ be such that $a - b \in \operatorname{Ann}(M)$ (in R). Then $(a - b)x =
0_M\ \forall x \in M \Rightarrow ax = bx\ \forall x \in M$.

(b) Ann(M) (in R) is a two-sided ideal in R $\Rightarrow$ R/Ann(M) is a ring. M can
be endowed with an R/Ann(M)-module structure (see Remark of Proposition 9.2.3)
under the external law of composition:

$R/\operatorname{Ann}(M) \times M \to M$, $(r) \cdot m \mapsto rm\ \ \forall (r) \in R/\operatorname{Ann}(M)$.

This composition is well defined. This is because $t \in (r) \Rightarrow (t) = (r)$ in
$R/\operatorname{Ann}(M) = S$ (say) $\Rightarrow t + \operatorname{Ann}(M) = r + \operatorname{Ann}(M) \Rightarrow r - t \in \operatorname{Ann}(M) \Rightarrow
rm = tm\ \forall m \in M$ by (a).

Then the annihilator of M in S

$= \{(r) \in S : (r) \cdot m = 0_M\ \forall m \in M\}$
$= \{(r) \in S : rm = 0_M\ \forall m \in M\}$
$= \{(r) \in S : r$ belongs to the annihilator of M in R$\}$
$= \{(0)\}.$

14. (a) Let R be a commutative unitary ring and M be an R-module. Given $r \in R$,
rM and $M_r$ are defined by

$rM = \{rm : m \in M\}$ and
$M_r = \{m \in M : rm = 0_M\}$.

Then rM and $M_r$ are both submodules of M.

(b) In particular, if $R = Z$ and $M = Z_n$, where $n = rs$ and $\gcd(r, s) = 1$, then
$rM = M_s$.

Solution (a) $0_M \in rM \Rightarrow rM \neq \emptyset$. Let $rx, ry \in rM$, where $x, y \in M$. Then
$rx + ry = r(x + y) \in rM$, since $x + y \in M$. Again, for $a \in R$, $a(rx) = (ar)x =
(ra)x$ (since R is commutative) $= r(ax) \in rM$, since $ax \in M$. Consequently,
rM is a submodule of M.

Similarly, $M_r$ is a submodule of M.

(b) Suppose $R = Z$ and $M = Z_n$, where $n = rs$ and $\gcd(r, s) = 1$.

Now $\gcd(r, s) = 1 \Rightarrow \exists\, a, b \in Z$ such that $ra + sb = 1$.

Let $m \in M = Z_n$. Then $rm \in rM$. Hence $s(rm) = (sr)m = nm = 0$ in $Z_n \Rightarrow
rm \in M_s \Rightarrow rM \subseteq M_s$.

On the other hand, let $m \in M_s$. Then $sm = 0_M$.

Again $ra + sb = 1 \Rightarrow m = 1 \cdot m = (ra + sb)m = ram + sbm = r(am) +
b(sm) = r(am) \in rM$, since $am \in M$ and $sm = 0_M$.

Thus $m \in rM \Rightarrow M_s \subseteq rM$. Consequently, $rM = M_s$.
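The equality $rM = M_s$ of part (b) is easy to confirm by enumeration. The following is a minimal sketch for $M = Z_n$ with $n = rs$; the function names are illustrative and not part of the text.

```python
# Brute-force comparison of rM and M_s inside Z_n, where n = r*s.

def r_multiples(n, r):
    """rM = {r*m : m in Z_n}."""
    return {(r * m) % n for m in range(n)}

def s_torsion(n, s):
    """M_s = {m in Z_n : s*m = 0}."""
    return {m for m in range(n) if (s * m) % n == 0}

def check(r, s):
    n = r * s
    return r_multiples(n, r) == s_torsion(n, s)

print(sorted(r_multiples(12, 3)), sorted(s_torsion(12, 4)), check(3, 4))
```

With r = 3 and s = 4 both sets are {0, 3, 6, 9} in $Z_{12}$.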

15. Let M be an R-module and A, B, C be submodules of M such that $A \subseteq B$,
$A + C = B + C$, $A \cap C = B \cap C$; then $A = B$.

Solution $(S(M), \subseteq)$ forms a modular lattice (see Theorem 9.2.6). Since $A \subseteq B$,
by the modular law, $(A + C) \cap B = A + (B \cap C)$. Hence $A = A + (A \cap C) =
A + (B \cap C) = (B \cap C) + A = B \cap (A + C) = B \cap (B + C) = B$.


9.10.1 Exercises 

Exercises-I 

1. For an additive abelian group M, let End(M) be the ring of endomorphisms of
M. If R is any ring, show that M is an R-module iff there exists a homomorphism
$\theta : R \to \operatorname{End}(M)$.

[Hint. Let M be an R-module with an action $R \times M \to M$, denoted by
$(r, x) \mapsto rx$. Define $\theta : R \to \operatorname{End}(M)$ by $r \mapsto \theta(r)$, where $\theta(r) : M \to M$ is
given by $x \mapsto rx\ \forall x \in M$ and $r \in R$. Then $\theta$ is a homomorphism of rings.
Conversely, suppose $\theta : R \to \operatorname{End}(M)$ is a ring homomorphism. Define the
action $R \times M \to M$, $(r, x) \mapsto rx = (\theta(r))(x)\ \forall r \in R$, $x \in M$. Then M is an
R-module.]

2. Show that a subring S of a ring R is a module over the whole ring R only if S
is an ideal of R.

3. (a) Let $f : M \to N$ be an R-homomorphism of modules with $P = \ker f$. Show
that there exists a unique module monomorphism $\bar{f} : M/P \to N$ such that
$f = \bar{f} \circ \pi$, where $\pi : M \to M/P$ is the natural module homomorphism.
(b) Let R be a ring and N a submodule of an R-module M. Show that there
is a one-to-one correspondence between the set of all submodules of M
containing N and the set of all submodules of M/N, given by $P \mapsto P/N$.
Hence prove that every submodule of M/N is of the form P/N, where P
is a submodule of M which contains N.

4. Let M be an R-module. Show that the submodules of M form a complete lattice
with respect to inclusion.

5. Let L, M, N be R-modules such that $N \subseteq M \subseteq L$. Show that $(L/N)/(M/N) \cong
L/M$.

[Hint. Define $f : L/N \to L/M$ by $f(x + N) = x + M\ \forall x \in L$. Check that
f is a well-defined R-module homomorphism of L/N onto L/M and $\ker f =
M/N$. Then apply Theorem 9.4.2.]

6. Let M be an R-module and P, Q submodules of M. Prove that $(P + Q)/P \cong
Q/(P \cap Q)$.

[Hint. The composite homomorphism $Q \to P + Q \to (P + Q)/P$ is surjective
and its kernel is $P \cap Q$. Then apply Theorem 9.4.2.]

7. Let R be a ring, I an ideal of R, M an R-module and $A\ (\neq \emptyset)$ a subset of M.
Show that

(a) $IA = \{\sum_{i=1}^{n} r_i a_i : r_i \in I$ and $a_i \in A\}$ is a submodule of M.

(b) M/IA is an R/I-module under the action: $R/I \times M/IA \to M/IA$,
$(r + I)(x + IA) \mapsto rx + IA$.

(c) $IM = \{\sum_{\text{finite sum}} a_i x_i : a_i \in I$ and $x_i \in M\}$ is a submodule of M.

8. Let A, B be submodules of an R-module M. Define $(A : B) = \{r \in R :
rB \subseteq A\}$. Show that (A : B) is an ideal of R. In particular, define the
annihilator of M, denoted by Ann(M), by $\operatorname{Ann}(M) = (0 : M) = \{r \in R : rM = 0\}$.
Show that Ann(M) is an ideal of R. If I is an ideal of R such that $I \subseteq \operatorname{Ann}(M)$,
show that M can be made into an R/I-module.

[Hint. Define $R/I \times M \to M$, $((x), m) \mapsto xm\ \forall (x) \in R/I$ and $m \in M$,
where (x) is represented by $x \in R$, i.e., $(x) = x + I$.]

9. Let A, B be submodules of an R-module M. Show that

(i) $\operatorname{Ann}(A + B) = \operatorname{Ann}(A) \cap \operatorname{Ann}(B)$;

(ii) $(A : B) = \operatorname{Ann}((A + B)/A)$.

10. Let M be a finitely generated R-module and I be an ideal of R and let f be an
R-module endomorphism of M such that $f(M) \subseteq IM$. Show that f satisfies
an equation of the form

$f^n + a_1 f^{n-1} + \cdots + a_n = 0 \quad (a_i \in I)$.

[Hint. Let $x_1, x_2, \ldots, x_n$ be a set of generators of M. Then for each i,
$f(x_i) \in IM$, so

$f(x_i) = \sum_{j=1}^{n} a_{ij} x_j \ (1 \le i \le n;\ a_{ij} \in I) \Rightarrow \sum_{j=1}^{n} (\delta_{ij} f - a_{ij}) x_j = 0$,

where $\delta_{ij}$ is the Kronecker delta. Multiply on the left by the adjoint of the matrix
$(\delta_{ij} f - a_{ij})$. Then it follows that $\det(\delta_{ij} f - a_{ij})$ annihilates each $x_i$, hence it
is the zero endomorphism of M. An equation of the required form is obtained
on expansion of the determinant.]

11. Let M be a finitely generated R-module and I an ideal of R such that $IM = M$.
Show that there exists $x \equiv 1 \pmod{I}$ such that $xM = 0$.

[Hint. Take f = identity in Exercise 10. Then $x = 1 + a_1 + \cdots + a_n$ gives
the required solution.]

Fig. 9.9 Commutative diagram for Five Lemma (diagram not recoverable from the
extraction; surviving labels: $M_4 \to M_5$ and $N_4 \to N_5$)

12. (Nakayama's Lemma) Let M be a finitely generated R-module and I an ideal
of R contained in the Jacobson radical J of R. Then $IM = M$ implies $M = 0$.

[Hint. Using Exercise 11, $xM = 0$ for some $x \equiv 1 \pmod{J} \Rightarrow$ x is invertible
in R (cf. Exercise 23(ii), Chap. 5) $\Rightarrow M = x^{-1}xM = 0$.]

13. Let M be a finitely generated R-module, N a submodule of M and I an ideal of
R contained in the Jacobson radical J of R. Then show that $M = IM + N \Rightarrow
M = N$.

[Hint. Note that $I(M/N) = (IM + N)/N$. Apply Nakayama's Lemma
to M/N.]

14. (The Five Lemma). Let the diagram of R-modules and their homomorphisms in
Fig. 9.9 be commutative, with exact rows. Prove the following:

(a) $f_1$ is an epimorphism and $f_2$, $f_4$ are monomorphisms $\Rightarrow f_3$ is a
monomorphism.

(b) $f_5$ is a monomorphism and $f_2$, $f_4$ are epimorphisms $\Rightarrow f_3$ is an
epimorphism.

[The lemma in Exercise 17, which is also called the Five Lemma, follows
from this exercise.]

15. (a) Let A, B, C, D be R-modules and $f : C \to A$ and $g : B \to D$ be
R-module homomorphisms. Then show that the map $\psi : \operatorname{Hom}_R(A, B) \to
\operatorname{Hom}_R(C, D)$ defined by $\psi(\alpha) = g\alpha f$ is a homomorphism of abelian
groups.

(b) Let

$A \xrightarrow{f} B \xrightarrow{g} C \to O$ (9.4)

be a sequence of R-modules and their homomorphisms. Show that
the sequence (9.4) is exact $\Leftrightarrow$ $\forall$ R-modules N, the sequence of abelian
groups

$O \to \operatorname{Hom}(C, N) \xrightarrow{g^*} \operatorname{Hom}(B, N) \xrightarrow{f^*} \operatorname{Hom}(A, N)$ (9.5)

is exact.

(c) Let

$O \to A \xrightarrow{f} B \xrightarrow{g} C$ (9.6)

be a sequence of R-modules and their homomorphisms. Show that the
sequence (9.6) is exact $\Leftrightarrow$ $\forall$ R-modules M, the sequence of abelian groups

$O \to \operatorname{Hom}(M, A) \xrightarrow{f_*} \operatorname{Hom}(M, B) \xrightarrow{g_*} \operatorname{Hom}(M, C)$ (9.7)

is exact.

Fig. 9.10 The Five Lemma diagram (two exact rows $A_1 \to A_2 \to A_3 \to A_4 \to A_5$
and $B_1 \to B_2 \to B_3 \to B_4 \to B_5$ joined by vertical maps $f_i : A_i \to B_i$)

16. If N is any submodule of an R-module M, show that the sequence
$O \to N \xrightarrow{i} M \xrightarrow{\pi} M/N \to O$ is exact, where i is the inclusion map and $\pi$ is the
natural homomorphism. Conversely, if the sequence of R-modules and
homomorphisms $O \to A \xrightarrow{f} M \xrightarrow{g} C \to O$ is exact, then $N = \operatorname{Im} f = \ker g$ is a
submodule of M and $N \cong A$ and $M/N \cong C$.

17. (The Five Lemma) If R is a ring and the diagram in Fig. 9.10 is commutative
with exact rows of R-modules and their homomorphisms such that each $f_i$
($i = 1, 2, 4, 5$) is an isomorphism, then show that $f_3$ is also an isomorphism.

18. Show that M is a Noetherian R-module iff every submodule of M is finitely
generated.

19. Let $O \to M' \xrightarrow{f} M \xrightarrow{g} M'' \to O$ be an exact sequence of R-modules and
their homomorphisms. Show that

(a) M is Noetherian $\Leftrightarrow$ M' and M'' are Noetherian;

(b) M is Artinian $\Leftrightarrow$ M' and M'' are Artinian.

20. If $M_i$ ($1 \le i \le n$) are Noetherian (Artinian) R-modules, show that $\bigoplus_{i=1}^{n} M_i$
is also.

21. Let R be a Noetherian (Artinian) ring and M a finitely generated R-module.
Show that M is a Noetherian (Artinian) R-module.

22. Prove the following: 

(a) Let M be a Noetherian (Artinian) R -module and I a submodule of M. Then 
M/I is a Noetherian (Artinian) R -module; 

(b) if an R-module M has a submodule I such that both I and M/I are
Noetherian (Artinian), then M is Noetherian (Artinian).

23. If $M\ (\neq 0)$ is a finitely generated R-module, show that M has a maximal
submodule.

24. (a) Let M be an R-module and N a submodule of M. Show that N is maximal
in M $\Leftrightarrow$ the R-module M/N is simple.

(b) Show that an R-module M is simple $\Leftrightarrow M \cong R/I$ for some maximal
ideal I of R.

25. Let M be an R-module and $\{M_i\}$, $(i \in J)$ be a family of simple submodules of
M such that $M = \sum_{i \in J} M_i$. Show that for each submodule N of M there is a
subset I of J such that $M = \bigl(\bigoplus_{i \in I} M_i\bigr) \oplus N$.

26. Let $M_1 \xrightarrow{f_1} M_2 \xrightarrow{f_2} M_3 \xrightarrow{f_3} M_4 \xrightarrow{f_4} M_5 \xrightarrow{f_5} M_6$ be an exact sequence of
R-modules. Show that the following statements are equivalent:

(a) $f_3$ is an isomorphism;

(b) $f_2$ and $f_4$ are trivial homomorphisms;

(c) $f_1$ is an epimorphism and $f_5$ is a monomorphism.

27. (a) Let the sequence $O \to M \xrightarrow{f} N \to O$ of R-modules be exact. Show that
f is an isomorphism.

(b) Let the sequence $O \to A_1 \xrightarrow{f} B \xrightarrow{g} A_2 \to O$ of R-modules be exact.
Show that the following statements are equivalent:

(i) There is an R-module homomorphism $h : A_2 \to B$ with $gh = 1_{A_2}$;

(ii) there is an R-module homomorphism $k : B \to A_1$ such that $kf = 1_{A_1}$;

(iii) the given sequence is isomorphic (with identity maps on $A_1$ and $A_2$)
to the short exact sequence $O \to A_1 \to A_1 \oplus A_2 \to A_2 \to O$; in
particular, $B \cong A_1 \oplus A_2$.

28. An exact sequence

$\cdots \to M_n \xrightarrow{f_n} M_{n+1} \xrightarrow{f_{n+1}} M_{n+2} \to \cdots$ (9.8)

is said to split at the R-module $M_{n+1}$ iff the R-submodule $M = \operatorname{Im} f_n =
\ker f_{n+1}$ of $M_{n+1}$ is a direct summand of $M_{n+1}$, i.e., iff $M_{n+1}$ is decomposable
into the direct sum of M and another R-submodule of $M_{n+1}$. If the exact
sequence (9.8) splits at $M_{n+1}$ $\forall n$, the sequence (9.8) is said to split.

(a) Prove that the sequence (9.8) splits if either (i) or (ii) holds.

(i) There exists an R-homomorphism $h : M_{n+1} \to M_n$ such that $hf_n :
M_n \to M_n$ is an automorphism.

(ii) There exists an R-homomorphism $k : M_{n+2} \to M_{n+1}$ such that $f_{n+1}k :
M_{n+2} \to M_{n+2}$ is an automorphism.

(b) Prove that if the sequence (9.8) splits, then

(i) $M_{n+1} \cong \operatorname{Im} f_n \oplus \operatorname{Im} f_{n+1} \cong M_n \oplus \operatorname{Im} f_{n+1}$

[Hint. Use (a)(i).]

(ii) $M_{n+1} \cong \operatorname{Im} f_n \oplus \operatorname{Im} f_{n+1} \cong \operatorname{Im} f_n \oplus M_{n+2}$

[Hint. Use (a)(ii).]

29. Let M, N, and P be given R-modules and $M \times N$ be the product of the sets M
and N. A function $f : M \times N \to P$ is said to be bilinear iff

$f(r_1m_1 + r_2m_2, n) = r_1f(m_1, n) + r_2f(m_2, n)$

and $f(m, r_1n_1 + r_2n_2) = r_1f(m, n_1) + r_2f(m, n_2)$ hold $\forall r_i \in R$, $m, m_i \in M$
and $n, n_i \in N$.

By a tensor product of M and N, we mean a pair (T, f), where T is an
R-module and $f : M \times N \to T$ is a bilinear function such that for every bilinear
function $g : M \times N \to P$ there exists a unique homomorphism

$h : T \to P$ satisfying $hf = g$.

Prove the following:

(a) If (T, f) is a tensor product of R-modules M and N, then $f(M \times N)$
generates T.

(b) If (T, f) and (P, g) are tensor products of M and N, then there exists a
unique isomorphism $h : T \to P$ of R-modules such that $hf = g$.

(c) Every pair of R-modules M and N determines a unique tensor product
(T, f) (up to isomorphism).

The product is denoted by $M \otimes_R N$ or simply by $M \otimes N$.

(d) For any given R-module M, $M \otimes R \cong M \cong R \otimes M$.

30. (a) For any R-modules M, N and P, prove the following isomorphisms.

(i) $M \oplus N \cong N \oplus M$;

(ii) $(M \oplus N) \oplus P \cong M \oplus (N \oplus P)$;

(iii) $M \otimes N \cong N \otimes M$;

(iv) $(M \otimes N) \otimes P \cong M \otimes (N \otimes P)$;

(v) $(M \oplus N) \otimes P \cong M \otimes P \oplus N \otimes P$;

(vi) $P \otimes (M \oplus N) \cong P \otimes M \oplus P \otimes N$.

(b) Let S be the set of all R-modules. Then show that $(S, \oplus, \otimes)$ forms a
commutative semiring.

[Hint. See Appendix A and use (a).]

31. Let G be a free Z-module of rank n and H be a subgroup of G of rank $m \le n$.
Then the index [G : H] is finite iff $m = n$. For $m = n$, let $\{x_1, x_2, \ldots, x_n\}$ be a
basis of G and $\{y_1, y_2, \ldots, y_n\}$ be a basis of H. Write

$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$ with $A \in M_n(Z)$.

Then show that $[G : H] = |\det A|$.
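The formula $[G : H] = |\det A|$ can be tested by counting in the rank 2 case. The sketch below assumes $G = Z^2$ with its standard basis and uses the fact that $dG \subseteq H$ for $d = |\det A|$ (because $\operatorname{adj}(A) \cdot A = \det(A) \cdot I$), so the index equals $|G/dG| \,/\, |H/dG|$. The function name and the sample vectors are illustrative only.

```python
# Count the index of a rank-2 sublattice H of Z^2 and compare it with |det A|.

def index_of_sublattice(y1, y2):
    """Return ([Z^2 : H], |det A|) for H spanned by the integer vectors y1, y2."""
    d = abs(y1[0] * y2[1] - y1[1] * y2[0])    # |det A|
    if d == 0:
        raise ValueError("H has rank < 2, so the index is infinite")
    # Image of H in (Z/d)^2; coefficients modulo d are enough to reach every element.
    image = {((a * y1[0] + b * y2[0]) % d, (a * y1[1] + b * y2[1]) % d)
             for a in range(d) for b in range(d)}
    # |Z^2 / dZ^2| = d^2 and |H / dZ^2| = len(image).
    return d * d // len(image), d

print(index_of_sublattice((2, 1), (0, 3)))
```

For the rows (2, 1) and (0, 3) the count and the determinant are both 6.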

32. Let R and A be integral domains such that $R \subseteq A$. If R is a Noetherian domain
and A is a finitely generated R-module, show that A is a Noetherian domain.

[Hint. Let $I_1 \subseteq I_2 \subseteq \cdots$ be an ascending chain of ideals in A. Then by
Proposition 9.8.1, A is a Noetherian R-module. Since each $I_i$ is an R-submodule of
A, the above chain must terminate.]

33. Let R be a ring with identity, M be an R-module and N be a submodule of M.
Show that M is Noetherian iff both N and M/N are Noetherian.

[Hint. Use the Corollary to Theorem 9.8.2 and Theorem 9.8.3.]

34. (Chinese Remainder Theorem for Modules) Let R be a commutative ring with 1
and $I_1, I_2, \ldots, I_t$ be ideals in R and M be an R-module.

(a) The map $M \to M/I_1M \oplus \cdots \oplus M/I_tM$, $x \mapsto (x + I_1M, \ldots, x + I_tM)$ is
an R-module homomorphism with kernel $I_1M \cap I_2M \cap \cdots \cap I_tM$;

(b) if the ideals $I_1, I_2, \ldots, I_t$ in (a) are pairwise co-prime (i.e., $I_i + I_j = R$ for
all $i \neq j$), then

$M/(I_1I_2 \cdots I_t)M \cong M/I_1M \oplus \cdots \oplus M/I_tM$.

[Hint. The proof is similar to the proof of the Chinese Remainder Theorem for
rings.]
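The simplest instance of Exercise 34 is $R = M = Z$ with $I_k = m_kZ$, where the map becomes $x \mapsto (x \bmod m_1, \ldots, x \bmod m_t)$ on $Z/(m_1 \cdots m_t)Z$. The sketch below, with illustrative function names, checks by enumeration whether that map is a bijection.

```python
# Chinese Remainder map for M = Z and I_k = m_k Z (an assumed special case).
from math import gcd, prod
from itertools import combinations

def crt_map(moduli):
    """Table of x + (m_1...m_t)Z -> (x + m_1 Z, ..., x + m_t Z)."""
    n = prod(moduli)
    return {x: tuple(x % m for m in moduli) for x in range(n)}

def pairwise_coprime(moduli):
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))

def is_bijection(moduli):
    # Domain and codomain both have prod(moduli) elements,
    # so the map is bijective exactly when it is injective.
    table = crt_map(moduli)
    return len(set(table.values())) == prod(moduli)

print(pairwise_coprime([2, 3, 5]), is_bijection([2, 3, 5]))
print(pairwise_coprime([2, 4]), is_bijection([2, 4]))
```

The moduli 2, 3, 5 are pairwise co-prime and give a bijection; the moduli 2, 4 are not co-prime and the map fails to be injective.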

35. The concept of Lie algebra can be extended to modules over commutative rings
with identity element.

Let M be a module over a commutative ring with identity element. Then a
bilinear map $M \times M \to M$, $(a, b) \mapsto [a, b]$ is said to make M a Lie algebra iff
it satisfies the condition $[a, a] = 0$, $\forall a \in M$ and the Jacobi identity.

(a) Let $M_n(R)$ be the ring of square matrices of order n over a commutative
ring R with 1. If $x, y \in M_n(R)$, then the bilinear map $M_n(R) \times M_n(R) \to
M_n(R)$, $(x, y) \mapsto [x, y] = xy - yx$ makes $M_n(R)$ into a Lie algebra.

(b) $H = \{X \in M_n(R) : \operatorname{trace} X = 0\}$ is also a Lie algebra and a (Lie)
subalgebra of $M_n(R)$.
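Both parts of Exercise 35 can be spot-checked on integer matrices. The sketch below uses plain nested lists for $2 \times 2$ matrices over Z; the helper names and the sample matrices are illustrative. It evaluates $[x, x]$, the Jacobi sum $[x, [y, z]] + [y, [z, x]] + [z, [x, y]]$ and the trace of a bracket.

```python
# Commutator bracket on square integer matrices stored as lists of rows.

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def bracket(x, y):
    """[x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(xy, yx)]

def jacobi_sum(x, y, z):
    return add(add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
               bracket(z, bracket(x, y)))

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

x, y, z = [[1, 2], [3, 4]], [[0, 1], [5, -2]], [[2, 0], [7, 1]]
print(bracket(x, x), jacobi_sum(x, y, z), trace(bracket(x, y)))
```

A bracket always has trace 0, which is why the trace-zero matrices in part (b) are closed under the bracket.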

36. A nonzero ring R is said to be right (left) semisimple iff it is semisimple as a
right (left) module over itself. Prove that the following statements are equivalent
for a ring R with 1.

(a) R is right semisimple;

(b) R is isomorphic to a direct sum of a finite number of matrix rings of square
matrices over division rings;

(c) R is left semisimple.

The result is known as the Wedderburn-Artin Theorem.

37. Show that the following statements are equivalent for a semisimple module M:

(a) M is Artinian;

(b) M is Noetherian;

(c) M is a direct sum of a finite number of minimal submodules of M.

38. Show that a semisimple module is finitely generated iff it is a sum of finitely
many minimal submodules.

39. (a) Show that a simple R-module M is free iff R is a division ring and M is of
dimension 1 over R;

(b) Show that

(i) a matrix ring $R = M_n(D)$ of square matrices of order n over a division
ring D is semisimple;

(ii) the matrix ring $R = M_n(Z)$ is not semisimple;

(iii) a commutative ring is semisimple iff it is a finite direct product of
fields.


9.11 Homology and Cohomology Modules 

In this section we continue discussion of exact sequences. 

The concept of homological algebra arose through the study of algebraic topol- 
ogy during the middle of the 20th century. Homological algebra borrows the lan- 
guage of algebraic topology, such as homology groups, homology modules, chains, 
boundaries, chain complex etc. The concept of cohomology modules is dual to that 
of homology modules. They are widely used in commutative algebra, algebraic ge- 
ometry and algebraic topology. 

Definition 9.11.1 A sequence $M = \{M_n, \partial_n\}$, $n \in Z$, of R-modules $M_n$ together
with a sequence of R-homomorphisms $\partial_n : M_n \to M_{n-1}$, such that $\partial_n \circ \partial_{n+1} = 0$,
$\forall n \in Z$, is called a chain complex and $\partial_n$ is called a boundary homomorphism.
More precisely,

$M : \cdots \to M_{n+1} \xrightarrow{\partial_{n+1}} M_n \xrightarrow{\partial_n} M_{n-1} \to \cdots$ (9.9)

is called a chain complex iff $\partial_n \circ \partial_{n+1} = 0$, $\forall n \in Z$.

Definition 9.11.2 The elements of $Z_n = \ker \partial_n$ are called n-cycles, the elements of
$B_n = \operatorname{Im} \partial_{n+1}$ are called n-boundaries and the elements of $M_n$ are called n-chains
of the complex (9.9).

Proposition 9.11.1 $B_n$ is a submodule of $Z_n$, $\forall n \in Z$.

Proof It follows from the condition of a chain complex M that $\partial_n \circ \partial_{n+1} = 0$,
$\forall n \in Z$, so that $\operatorname{Im} \partial_{n+1} \subseteq \ker \partial_n$. □

Definition 9.11.3 The quotient module $Z_n/B_n$ for any chain complex M, denoted
$H_n(M)$ (or simply $H_n$), is called the n-dimensional homology module of the chain
complex M. For $R = Z$, we get the homology groups $H_n(M)$. The complex M is said
to be acyclic iff $H_n(M) = 0$ $\forall n \in Z$. The elements of $H_n = Z_n/B_n$ are called
homology classes, denoted [z] for every $z \in Z_n$.

Remark If the homology module $H_n(M) = 0$, then the sequence (9.9) is exact at $M_n$.
This shows that the homology module of a chain complex measures its deviation
from the exactness of the sequence (9.9).
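To see the remark in action, one can compute the sizes of the homology modules of a small complex. The sketch below is restricted to an assumed special case: the coefficient ring is the field Q, so each $H_n$ is a vector space of dimension $\dim M_n - \operatorname{rank} \partial_n - \operatorname{rank} \partial_{n+1}$. The example complexes (the chains of a hollow and of a filled triangle) and all function names are illustrative and not taken from the text.

```python
# Dimensions of homology over Q for a finite chain complex given by matrices.
from fractions import Fraction

def rank(matrix):
    """Rank of an integer/rational matrix by Gaussian elimination over Q."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def homology_dims(dims, d):
    """dims[n] = dim M_n; d[n] = matrix of the boundary M_n -> M_{n-1}
    (rows indexed by M_{n-1}); a missing key means the zero map."""
    out = []
    for n in range(len(dims)):
        r_out = rank(d[n]) if n in d else 0
        r_in = rank(d[n + 1]) if n + 1 in d else 0
        out.append(dims[n] - r_out - r_in)
    return out

# Triangle with vertices v0, v1, v2 and edges e01, e12, e02.
D1 = [[-1, 0, -1], [1, -1, 0], [0, 1, 1]]   # boundary of each edge, as columns
D2 = [[1], [1], [-1]]                        # boundary of the face: e01 + e12 - e02

print(homology_dims([3, 3], {1: D1}))            # hollow triangle
print(homology_dims([3, 3, 1], {1: D1, 2: D2}))  # filled triangle
```

The hollow triangle has a one-dimensional $H_1$, so its complex is not exact at $M_1$; adding the face makes $H_1$ vanish.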
Fig. 9.11 Commutative diagram of two chain complexes

Definition 9.11.4 Let $M = \{M_n, \partial_n\}$, $n \in Z$ and $M' = \{M'_n, \partial'_n\}$, $n \in Z$ be two
chain complexes of R-modules. Then a sequence $\{f_n : M_n \to M'_n\}$, $n \in Z$ of
R-homomorphisms is called a chain map from M to M' iff these R-homomorphisms
commute with the boundary homomorphisms, i.e., iff each square in the diagram of
Fig. 9.11 is commutative, i.e., $f_{n-1} \circ \partial_n = \partial'_n \circ f_n$, $\forall n \in Z$.

We abbreviate the entire collection to $f : M \to M'$ and call f a chain map.

Proposition 9.11.2 Let $M = \{M_n, \partial_n\}$ and $M' = \{M'_n, \partial'_n\}$ be two chain complexes
of R-modules and $\{f_n : M_n \to M'_n\}$ a chain map. Then $f_n$ maps n-cycles of M
into n-cycles of M' and n-boundaries of M into n-boundaries of M', for all $n \in Z$.

Proof Left as an exercise. □

Proposition 9.11.3 Let $M = \{M_n, \partial_n\}$ and $M' = \{M'_n, \partial'_n\}$ be two chain complexes
of R-modules and $\{f_n : M_n \to M'_n\}$ a chain map. Then each $f_n : M_n \to M'_n$
determines an R-homomorphism

$H_n(f_n) = f_{n*} : H_n(M) \to H_n(M')$, $[z] \mapsto [f_n(z)]$.

Proof Left as an exercise. □

Definition 9.11.5 $H_n(f) = f_{n*} : H_n(M) \to H_n(M')$ is called the module
homomorphism (or homomorphism) in homology induced by $f_n$ for each $n \in Z$.

We simply write f and $f_*$ in place of $f_n$ and $f_{n*}$, respectively, when there is
no confusion.

Proposition 9.11.4 (a) If $f : M \to M'$ and $g : M' \to M''$ are two chain maps, then
their composite $g \circ f : M \to M''$ is a chain map such that $(g \circ f)_* = g_* \circ f_* :
H_n(M) \to H_n(M'')$;

(b) if $1_M : M \to M$ is the identity chain map, then $(1_M)_* : H_n(M) \to H_n(M)$ is
also the identity homomorphism.

Proof Left as an exercise. □

Definition 9.11.6 Let $M = \{M_n, \partial_n\}$ and $N = \{N_n, \partial'_n\}$ be two chain complexes
and $f, g : M \to N$ be two chain maps. Then f is said to be chain homotopic to g,
denoted $f \simeq g$, iff $\exists$ a sequence $\{F_n : M_n \to N_{n+1}\}$ of R-homomorphisms such that

$\partial'_{n+1} \circ F_n + F_{n-1} \circ \partial_n = f_n - g_n, \quad \forall n \in Z.$

A chain map $f : M \to N$ is called a chain homotopy equivalence iff $\exists$ a chain
map $g : N \to M$ such that $g \circ f \simeq 1_M$ and $f \circ g \simeq 1_N$.

Proposition 9.11.5 The relation of chain homotopy on the set C(M, N) of chain
maps from M to N is an equivalence relation.

Proof Left as an exercise. □

Proposition 9.11.6 Two homotopic chain maps $f, g : M \to N$ induce the same
homomorphism in the homology, i.e., $f \simeq g : M \to N \Rightarrow f_* = g_* : H_n(M) \to H_n(N)$
$\forall n \in Z$.

Proof Let $f \simeq g : M \to N$. Then $\exists$ a chain homotopy $\{F_n : M_n \to N_{n+1}\}$. Let
$[z] \in H_n(M)$. Then $\partial_n(z) = 0 \Rightarrow f_n(z) - g_n(z) = \partial'_{n+1}F_n(z)$ is a boundary
$\Rightarrow [f_n(z)] = [g_n(z)] \Rightarrow f_{n*}([z]) = g_{n*}([z])$, $\forall [z] \in H_n(M) \Rightarrow f_{n*} = g_{n*} \Rightarrow
f_* = g_*$. □
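The key step of this proof can be written out in one line. The display below assumes the usual defining identity of a chain homotopy, $\partial'_{n+1} \circ F_n + F_{n-1} \circ \partial_n = f_n - g_n$, and writes $B_n(N)$ for the n-boundaries of N (a notation introduced here only for this display). For a cycle $z \in Z_n(M)$, i.e. $\partial_n(z) = 0$:

```latex
\begin{align*}
f_n(z) - g_n(z) &= \partial'_{n+1}F_n(z) + F_{n-1}\partial_n(z) \\
                &= \partial'_{n+1}F_n(z) + F_{n-1}(0) \\
                &= \partial'_{n+1}\bigl(F_n(z)\bigr) \in \operatorname{Im}\partial'_{n+1} = B_n(N).
\end{align*}
```

So $f_n(z)$ and $g_n(z)$ differ by a boundary and define the same class in $H_n(N)$.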

A cohomology module is the dual of a homology module. A cochain complex,
cocycles, coboundaries, cochains, cohomology classes and a cohomology module
are defined dually as follows:

Definition 9.11.7 A sequence $M^* = \{M^n, \delta^n\}$, $n \in Z$, of R-modules $M^n$ together
with a sequence of R-homomorphisms $\delta^n : M^n \to M^{n+1}$, such that $\delta^n \circ \delta^{n-1} = 0$,
$\forall n \in Z$, is called a cochain complex and $\delta^n$ is called a coboundary homomorphism.
More precisely,

$M^* : \cdots \to M^{n-1} \xrightarrow{\delta^{n-1}} M^n \xrightarrow{\delta^n} M^{n+1} \to \cdots$ (9.10)

is called a cochain complex iff $\delta^n \circ \delta^{n-1} = 0$, $\forall n \in Z$.

Definition 9.11.8 The elements of $Z^n = \ker \delta^n$ are called n-cocycles, the elements
of $B^n = \operatorname{Im} \delta^{n-1}$ are called n-coboundaries and the elements of $M^n$ are called
n-cochains of the cochain complex (9.10).

Proposition 9.11.7 $B^n$ is a submodule of $Z^n$, $\forall n \in Z$.

Proof It follows from the condition of a cochain complex $M^*$ that $\delta^n \circ
\delta^{n-1} = 0$, $\forall n \in Z$. □

Definition 9.11.9 The quotient module $Z^n/B^n$ for any cochain complex $M^*$,
denoted $H^n(M^*)$ (or simply $H^n$), is called the n-dimensional cohomology module
of the cochain complex $M^*$. For $R = Z$, we get the cohomology group $H^n(M^*)$.
The elements of $H^n = Z^n/B^n$ are called cohomology classes, denoted [z] for
every $z \in Z^n$.

Definition 9.11.10 Let $M^* = \{M^n, \delta^n\}$ and $M'^* = \{M'^n, \delta'^n\}$ be two cochain
complexes of R-modules. Then a sequence $\{f^n : M^n \to M'^n\}$, $n \in Z$ of
R-homomorphisms is called a cochain map from $M^*$ to $M'^*$ iff these
R-homomorphisms commute with the coboundary homomorphisms, i.e., iff each square in the
diagram of Fig. 9.12 is commutative, i.e., $f^{n+1} \circ \delta^n = \delta'^n \circ f^n$, $\forall n \in Z$.

We can define a cochain homotopy between two cochain maps in a similar way
to Definition 9.11.6.


9.11.1 Exercises 

Exercises-II 

1. Show that a chain complex $M = \{M_n, \partial_n\}$ of R-modules and their
R-homomorphisms is exact iff $H_n(M) = 0$, $\forall n \in Z$.

[Hint. $Z_n = B_n \Leftrightarrow \ker \partial_n = \operatorname{Im} \partial_{n+1}$.]

2. Let M and N be two chain complexes of R-modules and R-homomorphisms
and $f : M \to N$ be a chain homotopy equivalence. Show that $H_n(f) = f_* :
H_n(M) \to H_n(N)$ is an R-isomorphism for all $n \in Z$.

[Hint. Let $f : M \to N$ be a chain homotopy equivalence. Then $\exists$ a chain
map $g : N \to M$ such that $g \circ f \simeq 1_M$ and $f \circ g \simeq 1_N$.

Then $(g \circ f)_* = g_* \circ f_* = \mathrm{Id}$ and $(f \circ g)_* = f_* \circ g_* = \mathrm{Id} \Rightarrow f_*$ is an
isomorphism of R-modules.]

3. Let $f, g : M \to N$ and $h, k : N \to P$ be chain maps. Show that $f \simeq g$ and
$h \simeq k \Rightarrow h \circ f \simeq k \circ g : M \to P$, i.e., the composites of chain homotopic maps are
chain homotopic.


9.12 Topology on Spectrum of Modules and Rings 

In this section we topologize the set of all prime ideals of rings and prime submod- 
ules of modules by defining closed sets. Moreover we study the properties of such 
spaces, with special reference to Zariski topology and schemes. 


9.12.1 Spectrum of Modules 

Let R be a ring and M an R -module. Define Spec M as the set of all prime sub- 
modules of M and call it as the spectrum of M. For any prime submodule K of M, 
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put 

V(K) = {N e Spec M : (N : M) D (K : M)} 

(A proper submodule N of M is said to be prime iff re e N , for r e R, e e M 
either e e N or r e (N : M).) 

(a) Show that X = Spec M satisfies the following properties: 

(i) $V(0) = \operatorname{Spec} M$ and $V(M) = \emptyset$;

(iii) $V(K_1) \cup V(K_2) = V(K_1 \cap K_2)$, where the $K_i$'s are submodules of $M$.

These results show that the sets $V(K)$ satisfy the axioms for closed sets in a
topological space. Then Spec $M$ becomes a topological space.

(b) Prove the following: 

(i) Let $Y$ be a closed subset of Spec $M$. Then $Y$ is irreducible (i.e., every pair
of non-empty open sets in $Y$ intersect) $\Leftrightarrow$ $Y = V(N)$ for some prime
submodule $N$ of $M$ (this $N$ is unique in the sense that any other submodule
$N'$ of $M$ with $Y = V(N')$ satisfies $(N : M) = (N' : M)$).

(ii) Let $M$ be a finitely generated $R$-module. Then $X = \operatorname{Spec} M$ is
irreducible $\Leftrightarrow$ $\bigcap_{N \in X} N$ is a prime submodule of $M$.

(iii) $M$ is a Noetherian $R$-module $\Rightarrow$ Spec $M$ is a Noetherian space.

[Hint. Let $X_1 \supseteq X_2 \supseteq \cdots$ be any decreasing sequence of closed subsets of
Spec $M$. Then we can write $X_i = V(K_i)$ for every $i$. Define $K'_i = \bigcap_{N \in X_i} N$.
Then $V(K_i) = V(K'_i)$ and $\{K'_i\}$ is an increasing sequence of submodules of $M$.
Since $M$ is a Noetherian $R$-module, this sequence becomes stationary and hence
Spec $M$ is a Noetherian space.]


9.12.2 Spectrum of Rings and Associated Schemes 

The prime spectrum or simply spectrum of a ring $R$ is the set Spec $R$ of all prime
ideals of $R$, i.e., Spec $R = \{P : P$ is a prime ideal of $R\}$. Let Max$(R) = \{M : M$ is
a maximal ideal of $R\}$. We now consider commutative rings with identity element.
Then by Theorem 5.3.3, Max$(R) \subseteq \operatorname{Spec} R$. Max$(R)$ is called the maximal spectrum
of $R$.

Let $R$ and $T$ be two commutative rings with identity elements and $f : R \to T$
be a ring homomorphism. Then $f$ induces a function Spec $f$ : Spec $T \to$ Spec $R$,
defined by

$(\operatorname{Spec} f)(Q) = f^{-1}(Q)$, $\forall Q \in \operatorname{Spec} T$. (9.11)

$Q$ is a prime ideal of $T$ $\Rightarrow$ $T/Q$ is an integral domain by Theorem 5.3.1. Now
the ring $R/f^{-1}(Q)$ is isomorphic to a subring of $T/Q$ $\Rightarrow$ $R/f^{-1}(Q)$ is an integral
domain $\Rightarrow$ $f^{-1}(Q)$ is a prime ideal of $R$ $\Rightarrow$ Spec $f$ is well defined.

For any set $X \subseteq R$, define $V(X) = \{P \in \operatorname{Spec} R : X \subseteq P\}$.

If $I$ is the ideal generated by $X$, then $V(X) = V(I) = V(\operatorname{rad} I)$.

Spec $R$ satisfies the following properties:

(i) $V(0) = \operatorname{Spec} R$ and $V(R) = \emptyset$;

(ii) $\bigcap_{\alpha \in A} V(X_\alpha) = V(\bigcup_{\alpha \in A} X_\alpha)$, where $\{X_\alpha\}_{\alpha \in A}$ is any family of subsets of $R$;

(iii) $V(I_1) \cup V(I_2) = V(I_1 \cap I_2)$, where $I_1$ and $I_2$ are ideals of $R$.

Thus the family of subsets of Spec $R$ which are of the form $V(X)$ is closed with
respect to arbitrary intersections and finite unions, and contains the empty set $\emptyset$ and the entire
set Spec $R$. Hence there exists a unique topology on the set Spec $R$ in which the closed
sets are of the form $V(X)$ for $X \subseteq R$.

The set Spec $R$ of all prime ideals of $R$, together with the family $\{V(X)\}_{X \subseteq R}$ as
the class of closed sets, forms a topological space. This topology of Spec $R$ is called
the Zariski topology.
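
As an illustration only, properties (ii) and (iii) can be checked for the ring $R = \mathbb{Z}$, where every non-zero prime ideal is $(p)$ for a prime $p$ and $V((a))$ consists of the primes dividing $a$. The sketch below is not part of the text: the names `prime_divisors`, `V` and `lcm` are hypothetical helpers, a prime ideal $(p)$ is represented by the integer `p`, and the zero ideal is left out, so only non-zero `a` is handled.

```python
from math import gcd


def prime_divisors(n):
    """Return the set of primes dividing the positive integer n."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def V(a):
    """Closed set V((a)) in Spec Z for a non-zero integer a (primes stand for prime ideals)."""
    return prime_divisors(abs(a))


def lcm(a, b):
    """Least common multiple; (a) ∩ (b) is the ideal generated by lcm(a, b)."""
    return abs(a * b) // gcd(a, b)


a, b = 12, 45
# Property (iii): V(I1) ∪ V(I2) = V(I1 ∩ I2), with I1 ∩ I2 = (lcm(a, b)).
print(V(a) | V(b) == V(lcm(a, b)))
# Property (ii) for two sets: V({a}) ∩ V({b}) = V({a, b}) = V((gcd(a, b))).
print(V(a) & V(b) == V(gcd(a, b)))
```

Here unions of closed sets correspond to least common multiples and intersections to greatest common divisors, which is the arithmetic shadow of the two closure properties.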

Problem 1 The topological space Spec $R$ endowed with the Zariski topology is a
Hausdorff space iff Spec $R$ = Max$(R)$.

Problem 2 The induced function Spec $f$ : Spec $T \to$ Spec $R$, defined by (9.11), is
a continuous function with respect to the Zariski topology.

Solution Given a set $X \subseteq R$, $(\operatorname{Spec} f)^{-1}(V(X)) = \{Q \in \operatorname{Spec} T : (\operatorname{Spec} f)(Q) \in
V(X)\} = \{Q \in \operatorname{Spec} T : f^{-1}(Q) \in V(X)\} = \{Q \in \operatorname{Spec} T : X \subseteq f^{-1}(Q)\} = \{Q \in
\operatorname{Spec} T : f(X) \subseteq Q\} = V(f(X))$ $\Rightarrow$ Spec $f$ is continuous.

Definition 9.12.1 The spectrum of a ring, endowed with Zariski topology is called 
a scheme. 

The name ‘scheme’ was given by Alexander Grothendieck. He developed the
theory of schemes, which revolutionized algebraic geometry. He was awarded the
‘Fields Medal’ in 1966 in recognition of his contribution to the theory of schemes.
An affine scheme comes with a ‘sheaf of rings’.

For further results of this section see Ex. 12 of Appendix B. 


9.13 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003; Atiyah and MacDon- 
ald 1969; Blyth 1977; Herstein 1964; Hilton and Stammbach 1971; Lambek 1966; 
Lang 1986; Musili 1992) for further details. 
Chapter 10

Algebraic Aspects of Number Theory 


There is no doubt, number theory has long been one of the favorite subjects not only 
for the students but also for the teachers of mathematics. It is a classical subject and 
has a reputation for being the “purest” part of mathematics, yet recent developments 
in cryptology, coding theory and computer science are based on elementary number 
theory. Number theory, in a general sense, is the study of numbers and their prop- 
erties. In Chap. 1, we have already seen some of the basic but important concepts 
of integers, such as, Peano’s Axiom, well ordering principle, division algorithm, 
greatest common divisors, prime numbers, fundamental theorem of arithmetic etc. 
In this chapter, we discuss some more interesting properties of integers, in partic- 
ular properties of prime numbers, and primality testing. In addition, we study the 
applications of number theory, particularly those directed towards theoretical computer
science. Number theory has been used in many ways to devise algorithms for
efficient computation and for computer operations with large integers. Both algebra
and number theory play an increasingly significant role in computing and commu- 
nication, as evidenced by the striking applications of these subjects to the fields of 
coding theory and cryptology. The motivation of this chapter is to provide an intro- 
duction to the algebraic aspects of number theory, mainly the study of development 
of the theory of prime numbers with an emphasis on algorithms and applications, 
which would be necessary for studying cryptology to be discussed in Chap. 12. In 
this chapter, we start with the introduction to prime numbers with a brief history. 
We provide several different proofs of the celebrated theorem by Euclid stating that 
there exist infinitely many primes. We further discuss Fermat numbers, Mersenne
numbers, Carmichael numbers, quadratic reciprocity, and multiplicative functions such
as the Euler $\phi$-function, the number of divisors function and the sum of divisors function. This
chapter ends with a discussion of primality testing, both deterministic and
probabilistic, such as the Solovay-Strassen and Miller-Rabin probabilistic primality tests.


10.1 A Brief History of Prime Numbers 

Prime numbers belong to an exclusive world of intellectual conceptions. Around 
2500 years ago, the ancient Greeks in the school of Pythagoras were interested in 
numbers for their mystical and numerological properties. They made the distinction
between composite numbers and prime numbers. They understood the idea
of primality and were interested in perfect numbers (a number whose sum of the
proper divisors is the number itself, e.g., 6, 28 etc.) and pairs of amicable numbers
(i.e., a pair of numbers such that the sum of the proper divisors of each number is equal to the
other, e.g., 220 and 284).

The book Elements written by Euclid, an ancient Greek mathematician, appeared
in about 300 BC. In Book IX of the Elements, Euclid included a proof that there
are infinitely many primes. This is considered to be one of the first known proofs
which uses the method of contradiction to establish a result. Euclid also came up
with a proof of the Fundamental Theorem of Arithmetic which states that “Every
integer greater than 1 can be written as a product of primes in an essentially unique way”. This
factorization can be found by trial division of the integer by primes not exceeding its
square root. Euclid also proved that a number $2^{n-1}(2^n - 1)$ is a perfect number if
the number $2^n - 1$ is prime. Euler showed that all even perfect numbers are exactly
of this form. To date, it is not known whether there are any odd perfect numbers.
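
Euclid's construction of perfect numbers is easy to try out numerically. The following is a minimal sketch, not taken from the book; the helper names `is_prime`, `sum_proper_divisors` and `euclid_perfect_numbers` are assumptions made for illustration, and the brute-force divisor sum is only suitable for small inputs.

```python
def is_prime(m):
    """Trial division by every d with d * d <= m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def sum_proper_divisors(m):
    """Sum of the positive divisors of m that are smaller than m."""
    return sum(d for d in range(1, m) if m % d == 0)


def euclid_perfect_numbers(max_n):
    """Numbers 2^(n-1) * (2^n - 1) for 2 <= n <= max_n with 2^n - 1 prime."""
    return [2 ** (n - 1) * (2 ** n - 1)
            for n in range(2, max_n + 1) if is_prime(2 ** n - 1)]


for m in euclid_perfect_numbers(7):
    # Each number produced by Euclid's formula equals the sum of its proper divisors.
    print(m, sum_proper_divisors(m) == m)
```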

In 17th century, Pierre de Fermat (1601-1665) took a leading role for the ad- 
vancement of the theory of prime numbers. He proved a speculation of Albert Girard 
that every prime number of the form 4n + 1 can be written in a unique way as the 
sum of two squares. He further proved a theorem which is now known as Fermat’s
Little theorem. The theorem states that if $p$ is prime and $a$ is an integer such that
gcd$(a, p) = 1$, that is, $a$ and $p$ are relatively prime, then $p$ divides $a^{p-1} - 1$.
Fermat’s Little theorem is the basis for many results in number theory and is one of the
bases of primality testing techniques which are still in use on today’s electronic
computers. He discovered a technique for factorizing large numbers, which he
illustrated by factorizing the number 2027651281 = 44021 × 46061. Although
Fermat claimed that he proved all of his theorems, very few records of his proofs have
survived. Many mathematicians, including Gauss, cast doubts on his several claims, 
because of the difficulty of some of the problems and the limited mathematical tools 
available to Fermat. Out of his claims, Fermat’s Last theorem is the most celebrated 
one. The theorem was first proposed by Fermat in the form of a note scribbled in 
the margin of his copy of the ancient Greek text Arithmetica by Diophantus. The 
theorem states that the Diophantine equation $x^n + y^n = z^n$ has no solutions in positive integers
for $n > 2$. Fermat left no proof of the conjecture for all $n$, except
for the special case $n = 4$. In the note, Fermat claimed that he had discovered a proof
of the whole theorem, but that the margin was too small to contain
the proof. It was called a “theorem” on the strength of Fermat’s statement, despite
the fact that mathematicians failed to prove it for hundreds of years. No successful
proof was published until 1995, when Andrew Wiles came up with a correct
proof of the theorem. This single theorem stimulated many new developments in
algebraic number theory in the 19th century and the proof of the modularity theorem
in the 20th century.

Factorization is considered to be another interesting research area in the field of
number theory. Although the Fundamental Theorem of Arithmetic guarantees that a
factorization exists, the basic way to find it is trial division of the integer by primes
not exceeding its square root. For large enough numbers, clearly this method is inefficient.
Fermat, Euler, and many other mathematicians have produced imaginative
factorization techniques. However, even using the most efficient technique and modern
computer facility yet devised, billions of years of computer time may be required to
factor a suitably chosen integer with 300 decimal digits.

From the very early development of prime numbers, mathematicians were very
much interested in finding formulas that can generate primes. Fermat conjectured
that the numbers $2^n + 1$ are always prime if $n$ is a power of 2. He verified this for
$n = 1, 2, 4, 8$, and $16$. Numbers of this form are called Fermat numbers. But about a
century later, his claim was shown to be wrong by the renowned Swiss mathematician
Leonhard Euler, who discovered that 641 is a factor of $2^{32} + 1 = 4294967297$.
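
Both Fermat's verification and Euler's counterexample can be reproduced with a few lines of integer arithmetic. This is only an illustrative sketch; `is_prime` is a hypothetical trial-division helper, not something defined in the text.

```python
def is_prime(m):
    """Trial division by every d with d * d <= m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


# Fermat's five cases: 2^n + 1 for n = 1, 2, 4, 8, 16.
for n in (1, 2, 4, 8, 16):
    print(n, 2 ** n + 1, is_prime(2 ** n + 1))

# Euler's counterexample for n = 32: 641 divides 2^32 + 1.
f5 = 2 ** 32 + 1
print(f5, f5 % 641 == 0, f5 // 641)
```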

The German mathematician Carl Friedrich Gauss, considered to be one of the 
greatest mathematicians of all time, developed the language of congruences in the 
early 19th century. When doing certain computations, integers may be replaced by 
their remainders when divided by a specific integer, using the language of congru- 
ences. Many questions can be phrased using the notion of a congruence that can only 
be awkwardly stated without this terminology. Congruences have diverse applica- 
tions to computer science, including applications to computer file storage, arithmetic 
with large integers, and the generation of pseudo-random numbers. 

The problem of distinguishing primes from composites, known as primality testing,
has been extensively studied. The ancient Greek scholar Eratosthenes discovered
a method, now called the sieve of Eratosthenes. This method finds all primes
less than a specified limit. Ancient Chinese mathematicians believed that the primes
were precisely those positive integers $n$ such that $n$ divides $2^n - 2$. Fermat proved
one part of the belief by showing that if $n$ is prime, then $n$ divides $2^n - 2$. The other
part of the belief was proven to be wrong, in the early 19th century, by showing
the existence of composite integers $n$ such that $n$ divides $2^n - 2$, such as $n = 341$.
However, it is possible to develop probabilistic primality tests based on the original
Chinese belief. Nowadays it is possible to find primes efficiently;
in fact, primes with as many as 400 decimal digits can be found in a few minutes
of computer time. For example, 1152163959944035801782591958215936605022 
9133173385690566201793635642 860209511722327 75708607210235066044390 
98144500755041680003586821792652531770745659087273762673860 8284693 
389055294664676473854093123176962687 1 80492223 1181 205228833608035170 
680964347 442368708701479356815648063495583572596496615099354396442 
21493 is a prime number having 309 digits. It is generated by using computers 
through the SAGE software. These large prime numbers are very useful in the context
of cryptology. The problem of efficiently determining whether a given integer is
prime or not was a longstanding and challenging problem. However, some efficient
probabilistic algorithms for primality testing are available, such as the Miller-Rabin
test, the Solovay-Strassen test etc. The breakthrough came in 2002 when Agrawal, Kayal,
and Saxena came up with an efficient deterministic algorithm for primality testing.
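
The ancient criterion "$n$ divides $2^n - 2$" and its failure at $n = 341$ can be checked with modular exponentiation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the test shown is only the base-2 criterion from the paragraph above, not the Miller-Rabin or Solovay-Strassen tests.

```python
def is_prime(m):
    """Trial division by every d with d * d <= m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def divides_2n_minus_2(n):
    """True iff n divides 2^n - 2, computed with fast modular exponentiation."""
    return pow(2, n, n) == 2 % n


# Every prime passes the criterion (Fermat), but so do some composites.
exceptions = [n for n in range(2, 1000)
              if divides_2n_minus_2(n) and not is_prime(n)]
print(exceptions)
```

The composites in `exceptions` are exactly the numbers below 1000 on which the criterion gives the wrong answer, which is why practical tests use random bases and stronger conditions.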
10.2 Some Properties of Prime Numbers

Theorem 10.2.1 Every positive integer greater than 1 has a prime divisor. 

Proof If possible, let there be a positive integer greater than 1 having no prime
divisors. Let $S = \{x : x$ is a positive integer greater than 1 having no prime divisors$\}$.
Then $S\ (\neq \emptyset) \subseteq \mathbb{N}^+$. Thus by the well ordering principle, $S$ must have a least
element, say $l$. Since $l$ has no prime divisors and $l$ divides $l$, $l$ cannot be a prime integer.
Thus $l = ab$, where $1 < a < l$ and $1 < b < l$. As $a < l$, $a$ must have a prime divisor,
say $p$. As $a$ divides $l$, $p$ must divide $l$, contradicting the fact that $l$ does not have
a prime divisor. Thus we conclude that every positive integer greater than 1 has a
prime divisor. □

A natural question that comes to our mind: Is the set of all primes finite? The 
answer to this question was given long back by Euclid by proving the following 
celebrated theorem in his Book IX of Elements. 

Theorem 10.2.2 There are infinitely many primes. 

Proof If possible, let there exist only $k$ primes: say $p_1 < p_2 < \cdots < p_k$. Let $N =
p_1 p_2 \cdots p_k + 1$. Then $N > 1$ and hence by Theorem 10.2.1, $N$ has a prime factor,
say $p$. Then $p$ must be one of $p_1, p_2, \ldots, p_k$. Thus $p$ divides $p_1 p_2 \cdots p_k = N - 1$.
Hence $p$ divides $N - (N - 1) = 1$, a contradiction, proving that there exist infinitely
many primes. □

Corollary 1 Let $p_n$ denote the $n$th prime. Then $p_n \le 2^{2^{n-1}}$, for $n \ge 1$.

Proof We shall prove the result by strong mathematical induction. Clearly, the result
is true for $n = 1$. We assume that the result is true for $n = 1, 2, \ldots, k$. We shall prove
the result for $n = k + 1$. From the proof of Theorem 10.2.2, we have seen that $N = p_1 p_2 \cdots p_k +
1$ is divisible by a prime $p$ and that the prime $p$ is not equal to any of the $p_i$'s, $i \in
\{1, 2, \ldots, k\}$. So, the prime $p$ must be greater than or equal to $p_{k+1}$. Thus $p_{k+1} \le p \le
p_1 p_2 \cdots p_k + 1 \le 2^{2^0} 2^{2^1} \cdots 2^{2^{k-1}} + 1 = 2^{2^k - 1} + 1 = \frac{1}{2} 2^{2^k} + 1 \le 2^{2^k}$. So the result
is true for $k + 1$. Hence by strong mathematical induction, the result is true for
all $n \ge 1$. □

Remark The estimate given above is very weak as $p_n$ is much smaller than $2^{2^{n-1}}$.
For example, for $n = 5$, $p_5 = 11$ while $2^{2^4} = 65536$.
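
How weak the bound of Corollary 1 is can be seen by tabulating the first few primes against $2^{2^{n-1}}$. This is a minimal sketch with a hypothetical helper `first_primes`; it is not part of the original text.

```python
def first_primes(count):
    """Return the first `count` primes, using trial division by earlier primes."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


for n, p in enumerate(first_primes(6), start=1):
    bound = 2 ** (2 ** (n - 1))
    # Columns: n, the nth prime, the bound 2^(2^(n-1)), and whether p_n <= bound.
    print(n, p, bound, p <= bound)
```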

Corollary 2 For $x > 0$, let $\pi(x)$ denote the number of primes less than or equal
to $x$. Then $\pi(x) \ge \lfloor \log_2(\log_2 x) \rfloor + 1$.

Proof Note that $\lfloor \log_2(\log_2 x) \rfloor + 1$ is the largest integer $n$ such that $2^{2^{n-1}} \le x$. Then
by the above Corollary 1, there are at least $n$ primes, say $p_1, p_2, \ldots, p_n$ such that
$p_i \le 2^{2^{i-1}}$, for $i = 1, 2, \ldots, n$. All these primes are less than or equal to $x$ and so
$\pi(x) \ge n = \lfloor \log_2(\log_2 x) \rfloor + 1$. □

Remark As before, this bound is also very weak. For example, let $x = 2^{2^4}$. Then
$\lfloor \log_2(\log_2 x) \rfloor + 1 = 5$, while $\pi(x) = 6542$.

In the literature of number theory, there have been many proofs of Theo- 
rem 10.2.2. In this chapter, we shall provide several such proof techniques. 

In 1878, Kummer came up with a very simple proof. 

Alternative Proof of Theorem 10.2.2 Suppose that there exist only finitely many
primes, $p_1 < p_2 < \cdots < p_r$. Let $N = p_1 p_2 \cdots p_r > 2$. Let us consider the integer
$N - 1 > 1$. Then by Theorem 10.2.1, $N - 1$ must have a prime divisor, say $p_i$,
common with $N$. So, $p_i$ divides both $N$ and $N - 1$, resulting in: $p_i$ divides $N -
(N - 1) = 1$, a contradiction. Hence the number of primes is infinite. □

In 1915, H. Brocard gave another simple proof that was published in the
Intermédiaire des Mathématiciens 22, page 253, attributed to Hermite.

Alternative Proof of Theorem 10.2.2 Let us consider an integer $s_n = n! + 1$, $n \ge 1$.
Then by Theorem 10.2.1, $s_n$ must have a prime divisor, say $p_n$. We claim that
$p_n > n$. If not, i.e., if $p_n \le n$, then $p_n$ divides $n! = s_n - 1$, which leads to the fact that
$p_n$ divides 1, a contradiction. Thus we find a prime larger than $n$, for every positive integer $n$. This
implies that there exist infinitely many primes. □
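
The key claim of this proof, that every prime divisor of $n! + 1$ exceeds $n$, can be observed for small $n$. The sketch below is illustrative only; `smallest_prime_factor` is a hypothetical helper based on trial division.

```python
from math import factorial


def smallest_prime_factor(m):
    """Smallest prime divisor of an integer m >= 2, found by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


for n in range(1, 11):
    s_n = factorial(n) + 1
    p_n = smallest_prime_factor(s_n)
    # The smallest prime factor of n! + 1 is always larger than n.
    print(n, s_n, p_n, p_n > n)
```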

Alternative Proof of Theorem 10.2.2 If possible, let there exist finitely many primes. 
Let P be the product of all these primes. Let P = ab be any factorization of P with 
positive integers a and b. Now for any prime p , either a or b is divisible by p , but 
not both. This implies that the positive integer a + b cannot have any prime factor, 
contradicting Theorem 10.2.1. □ 

In 1917, Métrod gave another simple proof.

Alternative Proof of Theorem 10.2.2 If possible, let there exist only $k$ primes: say
$p_1 < p_2 < \cdots < p_k$. Let $P$ be the product of all these primes, i.e., $P = p_1 p_2 \cdots p_k$.
For each $i = 1, 2, \ldots, k$, let $Q_i = P/p_i$. Then $p_i$ does not divide $Q_i$ for each $i$, while
$p_i$ divides $Q_j$ for $i \neq j$. Let $S = Q_1 + Q_2 + \cdots + Q_k$. As $S > 1$, by Theorem 10.2.1,
$S$ must have a prime divisor, say $q$. Then clearly, $q \neq p_i$, for all $i = 1, 2, \ldots, k$,
contradicting the fact that there exist only $k$ primes, $p_1, p_2, \ldots, p_k$. Thus there exist
infinitely many primes. □


Euler gave an indirect proof of Theorem 10.2.2. 
Alternative Proof of Theorem 10.2.2 If possible, let there exist finitely many primes,
say $p_1, p_2, \ldots, p_n$. For each $i = 1, 2, \ldots, n$ consider

$$\sum_{k=0}^{\infty} \frac{1}{p_i^k} = \frac{1}{1 - \frac{1}{p_i}}.$$

Multiplying these $n$ equalities, we obtain

$$\prod_{i=1}^{n} \left( \sum_{k=0}^{\infty} \frac{1}{p_i^k} \right) = \prod_{i=1}^{n} \frac{1}{1 - \frac{1}{p_i}},$$

where the left hand side is the sum of the inverses of all the natural numbers, each
counted once; this follows from the fundamental theorem of arithmetic that every
natural number, greater than 1, can be represented as the product of the primes in
a unique way. But the series $\sum_{m=1}^{\infty} \frac{1}{m}$ is divergent. Being a series of positive terms,
the order of the summation is irrelevant. So, the left hand side is infinite, while the
right hand side is clearly finite. This is absurd. □


Another famous result of number theory deals with the existence of infinitely 
many primes in arithmetic progressions. The result is known as Dirichlet’s Theorem 
on primes in arithmetic progressions. 


Statement of the Theorem Let a and b be relatively prime positive integers. Then 
the arithmetic progression an + b, n = 1, 2, 3, . . . contains infinitely many primes. 


G. Lejeune Dirichlet proved the theorem in 1837. The proof technique is
beyond the scope of this book. However, a special case of Dirichlet’s Theorem is
illustrated as follows.


Theorem 10.2.3 There are infinitely many primes of the form $4k + 3$ where $k =
0, 1, 2, \ldots$.

Proof If possible, let there exist only finitely many primes, say $p_1, p_2, \ldots, p_r$ of
the form $4k + 3$, $k = 0, 1, 2, \ldots$. Let $A = 4 p_1 p_2 \cdots p_r - 1$. Then $A$ is of the form
$4k + 3$ where $k$ is a non-negative integer. Let $p$ be a prime divisor of $A$. As $A$
is odd, the form of $p$ is either $4k + 1$ or $4k + 3$ for some integer $k$. Note that if
each of the prime divisors $p$ of $A$ is of the form $4k + 1$, then $A$ must also have the
same form, contradicting the fact that $A$ is of the form $4k + 3$. Thus $A$ is divisible
by at least one prime $p$ of the form $4k + 3$. So by hypothesis, $p = p_i$ for some
$i = 1, 2, \ldots, r$. This implies that $p$ divides $4 p_1 p_2 \cdots p_r - A = 1$, a contradiction,
proving the fact that there are infinitely many primes of the form $4k + 3$ where
$k = 0, 1, 2, \ldots$. □




10.2.1 Prime Number Theorem 

Euclid’s Theorem on the infinitude of the primes is considered to be the first result
on the distribution of primes. In 1737 Euler went a step further and proved that,
in fact, the series $\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \cdots$, i.e., the series of the reciprocals of
the primes, diverges. On the other hand, Euler further observed that the rate of
divergence of this series is much slower than the rate of divergence of the harmonic
series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$. This statement appears to be the earliest attempt to
quantify the frequency of the primes among the positive integers.

In 1793, Gauss conjectured that

$$\lim_{n \to \infty} \frac{\pi(n)}{n / \log n} = 1,$$

where $\pi(n)$ denotes the number of primes not exceeding a given positive integer $n$.
This result is called the Prime Number Theorem (PNT) which describes the
asymptotic distribution of the prime numbers. Gauss further observed that the logarithmic
integral $\operatorname{li}(x) = \int_2^x \frac{dt}{\log t}$ seemed to provide a very good approximation for $\pi(x)$. In
1896, Jacques Hadamard and Charles Jean de la Vallée-Poussin independently came
up with a proof of the prime number theorem.

Remark We have already shown that there are infinitely many primes. The
following theorem shows that there are arbitrarily long runs of integers containing
no primes.

Theorem 10.2.4 Given any positive integer n , there exist n consecutive composite 
integers. 

Proof Consider the integers $(n + 1)! + 2, (n + 1)! + 3, \ldots, (n + 1)! + n, (n + 1)! +
n + 1$. Every one of the above integers is composite, since $j$ divides $(n + 1)! + j$ if
$2 \le j \le n + 1$. □

Example 10.2.1 Note that 8! + 2,8! + 3,...,8! + 8 are 7 consecutive composite 
integers. However, these are much larger than the smallest seven consecutive com- 
posites: 90, 91, 92, ... , 96. 
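
The construction in Theorem 10.2.4 and the comparison made in Example 10.2.1 can both be reproduced by direct computation. The sketch below is illustrative; the names `composite_run` and `first_composite_run` are hypothetical, and the search for the first run uses plain trial division.

```python
from math import factorial


def is_prime(m):
    """Trial division by every d with d * d <= m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def composite_run(n):
    """The n consecutive integers (n+1)! + 2, ..., (n+1)! + (n+1) of Theorem 10.2.4."""
    base = factorial(n + 1)
    return [base + j for j in range(2, n + 2)]


def first_composite_run(length):
    """Start of the first run of `length` consecutive composite integers."""
    count, m = 0, 2
    while True:
        count = 0 if is_prime(m) else count + 1
        if count == length:
            return m - length + 1
        m += 1


run = composite_run(7)
print(run[0], run[-1], any(is_prime(m) for m in run))
print(first_composite_run(7))
```

The factorial construction guarantees a run but, as the example notes, it is far from the first one.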


10.2.2 Twin Primes 

Theorem 10.2.4 shows that there are arbitrarily long runs of integers containing no
primes. On the other hand, primes may often be close together. Note that the only
primes that are consecutive integers are 2 and 3. However, there may exist pairs of primes which
differ by 2. These pairs of primes are known as twin primes. The pairs (3,5),
(5,7), (11,13), (17,19), (29,31), (41,43), (59,61), (71,73), (101,103), (107,109),
(137,139), (149,151), (179,181), (191,193), (197,199), (227,229) are some of the
examples of twin prime pairs.
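
The list of twin prime pairs above can be regenerated with a short search. This is a sketch under the assumption that trial division is fast enough for the small range involved; `twin_primes` is a hypothetical helper name.

```python
def is_prime(m):
    """Trial division by every d with d * d <= m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def twin_primes(limit):
    """All pairs (p, p + 2) of primes with p + 2 <= limit."""
    return [(p, p + 2) for p in range(2, limit - 1)
            if is_prime(p) and is_prime(p + 2)]


print(twin_primes(230))
```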

The twin primes are used to define a very special type of constant, known as
Brun’s constant. In 1919, Viggo Brun proved that the sum of the reciprocals of all
the twin primes, i.e., $(\frac{1}{3} + \frac{1}{5}) + (\frac{1}{5} + \frac{1}{7}) + (\frac{1}{11} + \frac{1}{13}) + \cdots$, converges to a finite value,
known as Brun’s constant. In contrast, the series of all prime reciprocals diverges to
infinity.

Despite this proof by Viggo Brun regarding Brun’s constant, the question of 
whether there exist infinitely many twin primes has been one of the great open ques- 
tions in number theory for many years. The famous twin prime conjecture states 
that there are infinitely many primes $p$ such that $p + 2$ is also prime. In 1849 de
Polignac made the more general conjecture that for every natural number $k$, there
are infinitely many prime pairs $p$ and $p'$ such that $p' - p = 2k$. When $k = 1$, we get
the twin prime conjecture.


10.3 Multiplicative Functions 

In number theory, multiplicative functions play an important role. In this section,
we first study some general properties of multiplicative functions. Then we shall
discuss some special multiplicative functions, such as the Euler phi-function, the
number of positive divisors function and the sum of divisors function. Let us start with
some definitions.

Definition 10.3.1 A function that is defined for all positive integers is called an 
arithmetic function. 

We are interested in a particular type of arithmetic function which leads to the 
following definition. 

Definition 10.3.2 An arithmetic function $f$ is said to be a multiplicative function
iff $\forall a, b \in \mathbb{N}^+$ with gcd$(a, b) = 1$ we have $f(ab) = f(a) f(b)$.

Example 10.3.1 The functions $f, g : \mathbb{N}^+ \to \mathbb{N}^+$ defined by $f(n) = 1$ and $g(n) = n$,
$\forall n \in \mathbb{N}^+$ are examples of multiplicative functions.

Given the prime factorization of a positive integer, a simple formulation for a 
multiplicative function can be found through the following theorem. 

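Theorem 10.3.1 Let $f$ be a multiplicative function and for the natural number $n$,
let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$ be the prime factorization of $n$, where the $p_i$'s are distinct primes
and $\alpha_i \ge 1$, for $i = 1, 2, \ldots, r$. Then $f(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}) = f(p_1^{\alpha_1}) f(p_2^{\alpha_2}) \cdots f(p_r^{\alpha_r})$.

Proof We shall prove the result by using the principle of mathematical induction
on $r$. For $r = 1$, the result is obvious. So, we assume that the result is
true for some $k \ge 1$, i.e., $f(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}) = f(p_1^{\alpha_1}) f(p_2^{\alpha_2}) \cdots f(p_k^{\alpha_k})$. Now,
as gcd$(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}, p_{k+1}^{\alpha_{k+1}}) = 1$, $f(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k} p_{k+1}^{\alpha_{k+1}}) = f(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k})
f(p_{k+1}^{\alpha_{k+1}}) = f(p_1^{\alpha_1}) f(p_2^{\alpha_2}) \cdots f(p_k^{\alpha_k}) f(p_{k+1}^{\alpha_{k+1}})$, proving the result for $k + 1$. Hence
by the principle of mathematical induction, we have the required result. □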

Next we shall prove another important result that will be very useful in proving 
other results on multiplicative functions of a special type. 


Theorem 10.3.2 Let $f$ be a multiplicative function. Then the arithmetic function $F$
defined by $F(n) = \sum_{d \mid n} f(d)$ is also a multiplicative function.

Proof Let gcd$(a, b) = 1$. Note that for each pair of divisors $d_1$ of $a$ and $d_2$ of $b$,
there corresponds a divisor $d = d_1 d_2$ of $ab$. On the other hand each positive divisor
of $ab$ can be written uniquely as the product of relatively prime divisors $d_1$ of $a$ and
$d_2$ of $b$. Thus

$$F(ab) = \sum_{d \mid ab} f(d) = \sum_{\substack{d_1 \mid a \\ d_2 \mid b}} f(d_1 d_2) = \sum_{\substack{d_1 \mid a \\ d_2 \mid b}} f(d_1) f(d_2) = \sum_{d_1 \mid a} f(d_1) \sum_{d_2 \mid b} f(d_2) = F(a) F(b),$$

proving that $F$ is a multiplicative function. □
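
Theorem 10.3.2 can be tested numerically for a particular multiplicative function. In the sketch below the choice $f(d) = d^2$ is an assumption made purely for illustration, and `divisor_sum` is a hypothetical helper that evaluates $F(n) = \sum_{d \mid n} f(d)$ by brute force.

```python
from math import gcd


def divisor_sum(f, n):
    """F(n) = sum of f(d) over all positive divisors d of n."""
    return sum(f(d) for d in range(1, n + 1) if n % d == 0)


def square(d):
    """A multiplicative function chosen for the illustration: f(d) = d^2."""
    return d * d


# F(ab) = F(a) F(b) whenever gcd(a, b) = 1.
coprime_ok = all(
    divisor_sum(square, a * b) == divisor_sum(square, a) * divisor_sum(square, b)
    for a in range(1, 31) for b in range(1, 31) if gcd(a, b) == 1
)
print(coprime_ok)

# The coprimality hypothesis matters: compare F(4) with F(2) * F(2).
print(divisor_sum(square, 4), divisor_sum(square, 2) ** 2)
```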

Next we discuss a special type of multiplicative function, known as Euler phi- 
function, which is not only very useful in number theory but also plays an important 
role in constructing many important cryptographic schemes. 


10.3.1 Euler phi-Function 

Among all the multiplicative functions, Euler phi-function is the most important 
function in number theory. This function, named after the great Swiss mathemati- 
cian Leonhard Euler (1707-1783), is defined as follows. 

Definition 10.3.3 For a positive integer $n$, the Euler phi-function, denoted by $\phi(n)$,
is defined to be the number of positive integers not exceeding $n$ and relatively prime
to $n$.

The Euler phi-function and the ring $\mathbb{Z}_n$, for $n > 1$, have a nice connection. To deduce
the relation, let us first prove the following lemma.

Lemma 10.3.1 An equivalence class $(a)$ in $\mathbb{Z}_n$ is a unit in $\mathbb{Z}_n$ if and only if
gcd$(a, n) = 1$.
Proof Let $(a)$ in $\mathbb{Z}_n$ be a unit in $\mathbb{Z}_n$. Then there exists an equivalence class $(b)$ in $\mathbb{Z}_n$
such that $(a)(b) = (1)$. Thus $ab \equiv 1 \pmod{n}$, i.e., $ab = 1 + kn$ for some integer $k$.
This implies $ab + n(-k) = 1$, with the result that gcd$(a, n) = 1$, by the Corollary of
Theorem 1.4.5.

Conversely, let gcd$(a, n) = 1$. Then there exist integers $u$ and $v$ such that $au +
nv = 1$ which implies $au \equiv 1 \pmod{n}$, with the result $(a)(u) = (1)$. So, $(a)$ is a unit
in $\mathbb{Z}_n$. □

The immediate consequence of the above lemma is the following theorem, which 
provides a nice connection between number theory and ring theory. 

Theorem 10.3.3 For $n > 1$, the number of units of the ring $\mathbb{Z}_n$ is precisely $\phi(n)$.

Proof The proof follows directly from Lemma 10.3.1. □ 
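
Lemma 10.3.1 and Theorem 10.3.3 can be checked by brute force for small moduli. The following sketch is illustrative only; `units` searches for multiplicative inverses directly, `coprime_residues` applies the gcd criterion, and both names are hypothetical.

```python
from math import gcd


def units(n):
    """Residues a in Z_n that have some b in Z_n with a * b = 1 (mod n)."""
    return [a for a in range(n) if any((a * b) % n == 1 for b in range(n))]


def coprime_residues(n):
    """Residues a in Z_n with gcd(a, n) = 1."""
    return [a for a in range(n) if gcd(a, n) == 1]


for n in (6, 7, 12):
    # The two lists agree, and their common length is phi(n).
    print(n, units(n), units(n) == coprime_residues(n), len(units(n)))
```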

Now we shall study some of the basic properties of the Euler phi-function. We 
start by showing that the Euler phi-function is a multiplicative function. To prove 
this we need the following lemmas. 

Lemma 10.3.2 Let $n$ and $m$ be two positive integers such that gcd$(n, m) = 1$. Then
$((a), (b))$ is a generator of $\mathbb{Z}_n \times \mathbb{Z}_m$ if and only if $(a)$ is a generator of $\mathbb{Z}_n$ and $(b)$
is a generator of $\mathbb{Z}_m$.

Proof Let $((a), (b)) \in \mathbb{Z}_n \times \mathbb{Z}_m$ be a generator of $\mathbb{Z}_n \times \mathbb{Z}_m$. Let $(f) \in \mathbb{Z}_n$ and
$(g) \in \mathbb{Z}_m$. Then $((f), (g)) \in \mathbb{Z}_n \times \mathbb{Z}_m$. Hence there exists an integer $x$ such that
$((f), (g)) = x((a), (b))$. This implies that $(f) = x(a)$ and $(g) = x(b)$, implying
that $\mathbb{Z}_n = \langle (a) \rangle$ and $\mathbb{Z}_m = \langle (b) \rangle$, proving that $(a)$ is a generator of $\mathbb{Z}_n$ and $(b)$ is a
generator of $\mathbb{Z}_m$.

Conversely, let $(a)$ be a generator of $\mathbb{Z}_n$ and $(b)$ be a generator of $\mathbb{Z}_m$. The orders
of $(a)$ and $(b)$ are, respectively, $n$ and $m$. Now $nm((a), (b)) = ((0)_{\mathbb{Z}_n}, (0)_{\mathbb{Z}_m}) =
0_{\mathbb{Z}_n \times \mathbb{Z}_m}$. Further let $\lambda((a), (b)) = ((0)_{\mathbb{Z}_n}, (0)_{\mathbb{Z}_m})$. This implies $n$ divides $\lambda$ and $m$
divides $\lambda$. As $m$ and $n$ are co-prime, $nm$ divides $\lambda$, implying that the order of $((a), (b))$
is $mn$, proving that $((a), (b))$ is a generator of $\mathbb{Z}_n \times \mathbb{Z}_m$. □

The following lemma again provides a nice connection between number theory 
and cyclic group. 

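Lemma 10.3.3 The number of generators of a finite cyclic group of order $n$, $n \ge 2$,
is $\phi(n)$.

Proof Let $G$ be a finite cyclic group of order $n$. Then $G \cong \mathbb{Z}_n$. Let $(g)$ be a generator
of $\mathbb{Z}_n$. As $(1) \in \mathbb{Z}_n$, $(1) = m(g)$, for some integer $m \in \mathbb{Z}$. Thus there exists some
integer $k$ such that $1 - mg = kn$, i.e., $kn + mg = 1$. Hence by the Corollary of
Theorem 1.4.5, we have gcd$(n, g) = 1$. Conversely, if gcd$(n, g) = 1$, then $kn + mg = 1$ for
some integers $k$ and $m$, so $(1) = m(g)$ and $(g)$ generates $\mathbb{Z}_n$. Thus the number of generators of $\mathbb{Z}_n$ is $\phi(n)$. □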

Now we are in a position to prove the following theorem. 
Theorem 10.3.4 The Euler phi-function is a multiplicative function.

Proof Let $n$ and $m$ be two positive integers such that gcd$(n, m) = 1$. Let us consider
the four groups $\mathbb{Z}_n$, $\mathbb{Z}_m$, $\mathbb{Z}_{nm}$ and $\mathbb{Z}_n \times \mathbb{Z}_m$. As gcd$(n, m) = 1$, $\mathbb{Z}_n \times \mathbb{Z}_m$ is a cyclic
group which is isomorphic to $\mathbb{Z}_{nm}$. By Lemma 10.3.3, it follows that the numbers of
generators of $\mathbb{Z}_n$, $\mathbb{Z}_m$ and $\mathbb{Z}_{nm}$ are, respectively, $\phi(n)$, $\phi(m)$ and $\phi(nm)$. Now by
Lemma 10.3.2, it follows that the number of generators of $\mathbb{Z}_n \times \mathbb{Z}_m$ is $\phi(n) \cdot \phi(m)$.
As $\mathbb{Z}_{nm}$ and $\mathbb{Z}_n \times \mathbb{Z}_m$ are isomorphic groups, they must have the same number
of generators, proving the fact that $\phi(nm) = \phi(n) \cdot \phi(m)$. Consequently, $\phi$ is a
multiplicative function. □

Now we shall show through the following theorems how to evaluate $\phi(n)$ for a
given positive integer $n$, provided the factorization of $n$ is known.

Theorem 10.3.5 Let $p$ be any prime integer. Then for any positive integer $n$,
$\phi(p^n) = p^n - p^{n-1}$.

Proof The only positive integers not exceeding $p^n$ and not relatively prime to $p^n$
are $kp$, where $k$ is an integer such that $1 \le k \le p^{n-1}$. Thus the number of positive
integers not exceeding $p^n$ and relatively prime to $p^n$ is $p^n - p^{n-1}$. □

The next theorem is for general $n$ whose prime factorization is known.

Theorem 10.3.6 Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$ be the prime factorization of $n$, where the $p_i$'s
are distinct primes. Then $\phi(n) = n(1 - \frac{1}{p_1})(1 - \frac{1}{p_2}) \cdots (1 - \frac{1}{p_r})$.

Proof The result follows from Theorems 10.3.4, 10.3.1, and 10.3.5. □
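
The product formula of Theorem 10.3.6 can be compared against the definition of $\phi(n)$ by direct counting. This is a minimal sketch; the function names are hypothetical, and the formula is evaluated with integer arithmetic by replacing each factor $(1 - \frac{1}{p})$ with the update `result -= result // p`.

```python
from math import gcd


def phi_by_counting(n):
    """phi(n) from Definition 10.3.3: count 1 <= a <= n with gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def phi_by_formula(n):
    """phi(n) = n * product of (1 - 1/p) over the distinct primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # one prime factor larger than the square root may remain
        result -= result // m
    return result


for n in (1, 9, 27, 36, 97, 360):
    print(n, phi_by_counting(n), phi_by_formula(n))
```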


10.3.2 Sum of Divisor Functions and Number of Divisor 
Functions 

Besides the Euler phi-function, there are two more important multiplicative functions
in number theory, namely the sum of divisors function and the number of divisors function,
as defined below.

Definition 10.3.4 For a positive integer $n$, the sum of divisor function, denoted by
$\sigma$, is defined by setting $\sigma(n)$ to be equal to the sum of all positive divisors of $n$.

Example 10.3.2 For $n = 6$, $\sigma(6) = 1 + 2 + 3 + 6 = 12$.

Definition 10.3.5 For a positive integer $n$, the number of divisor function, denoted
by $\tau$, is defined by setting $\tau(n)$ to be equal to the number of all positive divisors
of $n$.
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Example 10.3.3 For n = 6, r(6) = 4. 

We now show that both a and r are multiplicative functions. 
Theorem 10.3.7 o and t are multiplicative functions. 


Proof Note that f(n) = n and g(n) = 1 are multiplicative functions. The rest of 
the theorem follows from Theorem 10.3.2 and the fact that cr(n) = J2d\ n fid) and 

nn) = Ed\ n 8(d). □ 

Theorem 10.3.8 Let $p$ be a prime integer and $n$ be a positive integer. Then
\[ \sigma(p^n) = \frac{p^{n+1} - 1}{p - 1} \quad \text{and} \quad \tau(p^n) = n + 1. \]

Proof The divisors of $p^n$ are $1, p, p^2, p^3, \ldots, p^n$. Thus $p^n$ has exactly $n + 1$
positive divisors, with the result $\tau(p^n) = n + 1$ and
\[ \sigma(p^n) = 1 + p + p^2 + \cdots + p^n = \frac{p^{n+1} - 1}{p - 1}. \] □

The above result may be further generalized for any positive integer n. 


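Theorem 10.3.9 Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$ be the prime factorization of $n$,
where the $p_i$'s are distinct primes. Then
\[ \sigma(n) = \frac{p_1^{\alpha_1 + 1} - 1}{p_1 - 1} \cdot \frac{p_2^{\alpha_2 + 1} - 1}{p_2 - 1} \cdots \frac{p_r^{\alpha_r + 1} - 1}{p_r - 1}
\quad \text{and} \quad
\tau(n) = (\alpha_1 + 1)(\alpha_2 + 1) \cdots (\alpha_r + 1). \]
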

Proof The result follows from Theorems 10.3.7, 10.3.1, and 10.3.8. □ 
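
As a quick numerical companion to Theorem 10.3.9, the two closed formulas can be evaluated directly from a prime factorization. This is a minimal Python sketch; the names `prime_factorization`, `sigma` and `tau` are illustrative choices, and trial division is assumed to be sufficient for the small inputs shown.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 1, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def sigma(n):
    """Sum of positive divisors: product of (p^(alpha+1) - 1) / (p - 1)."""
    total = 1
    for p, alpha in prime_factorization(n).items():
        total *= (p ** (alpha + 1) - 1) // (p - 1)
    return total


def tau(n):
    """Number of positive divisors: product of (alpha + 1)."""
    count = 1
    for alpha in prime_factorization(n).values():
        count *= alpha + 1
    return count
```

For $n = 6$ the functions reproduce the values of Examples 10.3.2 and 10.3.3.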


10.4 Group of Units 

The group of units plays an important role in number theory and has tremendous
applications in cryptology and coding theory. While studying ring theory, we have
seen that the ring $(\mathbb{Z}_n, +, \cdot)$ contains zero divisors if and only if $n$ is
composite. For example, $\mathbb{Z}_6$ contains $(2)$ and $(3)$ such that $(2)(3) = (3)(2) = (0)$.
This fact implies that $\mathbb{Z}_n$ is not a field under the usual addition and multiplication
modulo $n$ if $n$ is composite. Therefore, $(\mathbb{Z}_n \setminus \{(0)\}, \cdot)$ may not be a group in general.
However, $(\mathbb{Z}_n, +, \cdot)$ forms a commutative ring. We may therefore look at the units of
$\mathbb{Z}_n$ and try to put an algebraic structure on the set of all units of $\mathbb{Z}_n$.

Definition 10.4.1 An equivalence class $(a)$ in $\mathbb{Z}_n$, $n \ge 2$, is called a unit in $\mathbb{Z}_n$ iff
there exists an equivalence class $(b)$ in $\mathbb{Z}_n$ such that $(a)(b) = (b)(a) = (1)$.

Definition 10.4.2 The collection of all units in $\mathbb{Z}_n$, $n \ge 2$, is called the set of units
of $\mathbb{Z}_n$, and is denoted by $\mathbb{Z}_n^*$.



We now provide some algebraic structure on $\mathbb{Z}_n^*$. The subsequent
discussion will reveal that $\mathbb{Z}_n^*$ indeed has a rich algebraic structure. The
following theorem shows that $\mathbb{Z}_n^*$ forms a group, known as the group of units, under
the usual multiplication modulo $n$.

Theorem 10.4.1 For each integer $n \ge 2$, the set $\mathbb{Z}_n^*$ forms a commutative group.

Proof As $(1) \in \mathbb{Z}_n^*$, $\mathbb{Z}_n^* \ne \emptyset$. So the set $\mathbb{Z}_n^*$ is a non-empty subset of the
commutative ring $(\mathbb{Z}_n, +, \cdot)$. Let $(a), (b) \in \mathbb{Z}_n^*$. Then by the definition of $\mathbb{Z}_n^*$, there
exist $(u), (v) \in \mathbb{Z}_n^*$ such that $(a)(u) = (1)$ and $(b)(v) = (1)$, with the result
$(ab)(uv) = (1)$, so that $(ab) \in \mathbb{Z}_n^*$. Associativity and commutativity in $\mathbb{Z}_n^*$ follow from
the hereditary property of $(\mathbb{Z}_n, +, \cdot)$. Finally, the fact that for each $(a) \in \mathbb{Z}_n^*$ there
exists $(b) \in \mathbb{Z}_n^*$ such that $(a)(b) = (1)$ follows from the definition of $\mathbb{Z}_n^*$. Combining
all the arguments, it follows that $\mathbb{Z}_n^*$ forms a commutative group. □

The following proposition provides the order of the group $\mathbb{Z}_n^*$.

Proposition 10.4.1 $|\mathbb{Z}_n^*| = \phi(n)$, where $|\mathbb{Z}_n^*|$ denotes the order of the group $\mathbb{Z}_n^*$.

Proof Follows from Lemma 10.3.1. □
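
For small moduli the group of units can simply be listed. The Python sketch below is an illustration under two assumptions: a class is taken to be a unit exactly when its representative is coprime to $n$ (the criterion behind Proposition 10.4.1), and Python 3.8 or later is available so that `pow(a, -1, n)` returns a modular inverse. The names `units` and `inverse_mod` are our own.

```python
from math import gcd


def units(n):
    """Representatives 1 <= a < n of the units of Z_n, for n >= 2."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def inverse_mod(a, n):
    """Multiplicative inverse of a modulo n (requires gcd(a, n) = 1)."""
    return pow(a, -1, n)  # three-argument pow with exponent -1, Python 3.8+


if __name__ == "__main__":
    n = 10
    us = units(n)
    print(us)                                  # representatives of the units of Z_10
    print([inverse_mod(a, n) for a in us])     # their inverses, in the same order
```

The length of `units(n)` is $\phi(n)$, and the product of any two listed representatives, reduced modulo $n$, is again in the list.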

Using the properties of the group $(\mathbb{Z}_n^*, \cdot)$, we deduce the following important
theorems of number theory.

Theorem 10.4.2 (Fermat’s Little Theorem) Let $p$ be a prime and $a$ be an integer
such that $\gcd(a, p) = 1$. Then $a^{p-1} \equiv 1 \pmod{p}$.

Proof Consider the group $\mathbb{Z}_p^*$ with $|\mathbb{Z}_p^*| = p - 1$. As $\gcd(a, p) = 1$, we have
$(a) \in \mathbb{Z}_p^*$. Let the order of $(a)$ in $\mathbb{Z}_p^*$ be $n$. Then $(a)^n = (1)$
and the order of the subgroup generated by $(a)$ is $n$. Thus by Lagrange’s Theorem,
$n \mid (p - 1)$. Then there exists some integer $k$ such that $p - 1 = kn$. Thus
$(a)^{p-1} = (a)^{kn} = ((a)^n)^k = (1)$, with the result $a^{p-1} \equiv 1 \pmod{p}$. □

Corollary Let $p$ be a prime and $a$ be any integer. Then $a^p \equiv a \pmod{p}$.

Proof We divide the proof into two cases. In the first case, let us assume that
$\gcd(a, p) = 1$. Then the result follows from Theorem 10.4.2. For the second case,
let us assume that $\gcd(a, p) \ne 1$. In that case, $p$ must divide $a$. Thus $a \equiv 0 \pmod{p}$,
with the result $a^p \equiv 0 \equiv a \pmod{p}$. □

Theorem 10.4.3 (Euler’s Theorem) Let $n$ be a positive integer and $a$ be an integer
such that $\gcd(a, n) = 1$. Then $a^{\phi(n)} \equiv 1 \pmod{n}$.

Proof Let us consider the group $\mathbb{Z}_n^*$. Then by Proposition 10.4.1, the order of the
group is $\phi(n)$. As $\gcd(a, n) = 1$, $(a) \in \mathbb{Z}_n^*$. Let the order of the element $(a)$ in
$\mathbb{Z}_n^*$ be $k$. Then $(a)^k = (1)$ and the order of the subgroup generated by $(a)$ is also $k$.
Thus by Lagrange’s Theorem, $k$ divides $\phi(n)$. Let $\phi(n) = kt$ for some integer $t$.
Then $(a)^{\phi(n)} = (a)^{kt} = ((a)^k)^t = (1)$ implies that $a^{\phi(n)} \equiv 1 \pmod{n}$. □

Theorem 10.4.4 (Wilson’s Theorem) For any prime integer $p$,
$(p - 1)! \equiv -1 \pmod{p}$.

Proof The non-zero elements of $\mathbb{Z}_p$, i.e., the elements of $\mathbb{Z}_p^*$, form a multiplicative
group of order $p - 1$, and the only self-inverse elements of $\mathbb{Z}_p^*$ are $(1)$ and
$(-1) = (p - 1)$. The other elements of $\mathbb{Z}_p^*$ can be paired with their inverses so that
the product of each pair is the identity $(1)$. Consequently, the product of all the
non-zero elements is $(1)(2) \cdots (p - 1) = (-1)$ in $\mathbb{Z}_p^*$. This shows that $((p - 1)!) = (-1)$
and hence $(p - 1)! \equiv -1 \pmod{p}$. □
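
The three congruences just proved (Fermat, Euler, Wilson) are easy to test numerically for small moduli. The sketch below is only a sanity check, not a proof; the helper names `phi`, `check_euler`, `check_fermat_corollary` and `check_wilson` are our own, and `phi` is computed by direct counting, which is assumed to be acceptable for small $n$.

```python
from math import factorial, gcd


def phi(n):
    """Euler phi by direct counting (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def check_euler(n):
    """True if a^phi(n) = 1 (mod n) for every unit a of Z_n, n >= 2."""
    e = phi(n)
    return all(pow(a, e, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


def check_fermat_corollary(p):
    """True if a^p = a (mod p) for a range of integers a."""
    return all(pow(a, p, p) == a % p for a in range(0, 3 * p))


def check_wilson(p):
    """True if (p - 1)! = -1 (mod p)."""
    return factorial(p - 1) % p == p - 1
```

Note that `check_wilson` returns `False` for composite arguments such as 4 and 9, although the theorem above only asserts the congruence for primes.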

Definition 10.4.3 If $\mathbb{Z}_n^*$ is cyclic, then any generator $(g)$ of $\mathbb{Z}_n^*$ is called a primitive
root for $\mathbb{Z}_n^*$.

Example 10.4.1 (2) is a primitive root for Z|, whereas there are no primitive roots 
for Zg as Zg is not cyclic. 

From the above example, it is clear that a primitive root may or may not exist. Even
if one exists, finding primitive roots in $\mathbb{Z}_n^*$ is a non-trivial problem. To date, no
efficient general algorithm for finding primitive roots is known. One obvious
but tedious method is to try each of the $\phi(n)$ units $(u) \in \mathbb{Z}_n^*$ and check the order of
$(u)$ in $\mathbb{Z}_n^*$. If we find some element of order $\phi(n)$, that element is a
primitive root. A slightly more efficient test for primitive roots
is as follows:

Proposition 10.4.2 An element $(g)$ is a primitive root for $\mathbb{Z}_n^*$ if and only if
$(g)^{\phi(n)/p} \ne (1)$ in $\mathbb{Z}_n^*$ for each prime $p$ dividing $\phi(n)$.

Proof Let $(g)$ be a primitive root for $\mathbb{Z}_n^*$. Then the order of $(g)$ is $\phi(n)$. The proof
follows from the definition of the order of $(g)$, i.e., $(g)^i \ne (1)$ for all $i$ such that
$1 \le i < \phi(n)$.

Conversely, let $(g)^{\phi(n)/p} \ne (1)$ in $\mathbb{Z}_n^*$ for each prime $p$ dividing $\phi(n)$. If possible,
let $(g)$ not be a primitive root for $\mathbb{Z}_n^*$. Then its order, say $m$, must be a proper factor of
$\phi(n)$, with the result $\phi(n)/m > 1$. If $p$ is any prime factor of $\phi(n)/m$, then $m$ must
divide $\phi(n)/p$, with the result $(g)^{\phi(n)/p} = (1)$ in $\mathbb{Z}_n^*$, contradicting the hypothesis.
Thus $(g)$ must be a primitive root for $\mathbb{Z}_n^*$. □
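
The test of Proposition 10.4.2 translates almost word for word into code: factor $\phi(n)$ and check one modular power per prime factor. The following Python sketch is illustrative only; the function names are our own, and $\phi(n)$ and its prime factors are obtained by naive methods that are assumed adequate for small $n$.

```python
from math import gcd


def prime_divisors(m):
    """Distinct prime divisors of m >= 1, by trial division."""
    primes, d = [], 2
    while d * d <= m:
        if m % d == 0:
            primes.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        primes.append(m)
    return primes


def phi(n):
    """Euler phi by direct counting (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def is_primitive_root(g, n):
    """Proposition 10.4.2: g^(phi(n)/p) != 1 (mod n) for every prime p | phi(n)."""
    if gcd(g, n) != 1:
        return False
    order = phi(n)
    return all(pow(g, order // p, n) != 1 for p in prime_divisors(order))


def primitive_roots(n):
    """All primitive roots modulo n among 1, ..., n - 1 (empty if none exist)."""
    return [g for g in range(1, n) if is_primitive_root(g, n)]
```

Only as many modular exponentiations as there are distinct prime factors of $\phi(n)$ are needed per candidate, instead of computing the full order of each unit.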

Next we shall prove a very important theorem, which plays an important role in
cryptography. We know that $(\mathbb{Z}_p, +, \cdot)$ forms a field if $p$ is a prime number. As a
result, $(\mathbb{Z}_p \setminus \{(0)\}, \cdot) = \mathbb{Z}_p^*$ is a commutative group. We now prove that $(\mathbb{Z}_p^*, \cdot)$ is not
only a commutative group, but a cyclic group of order $p - 1$. In fact, we shall prove
a much more general result for finite fields. To do so, we first prove the following lemma.

Lemma 10.4.1 Let $G$ be a commutative group and $a, b \in G$ be such that $o(a) = m$,
$o(b) = n$ and $\gcd(m, n) = 1$. Then $o(ab) = mn$.

Proof Let $o(ab) = k$. As $(ab)^{mn} = e_G$, the identity element of $G$, $k$ divides $mn$. On
the other hand, as $G$ is a commutative group, $(ab)^k = e_G$ implies $a^{kn} = b^{-kn}$, with
the result $a^{kn} = e_G$, since $b^n = e_G$. Thus $m$ divides $kn$. As $\gcd(m, n) = 1$, $m$ divides $kn$
implies $m$ divides $k$. Similarly, it can be shown that $n$ divides $k$, with the result that $mn$
divides $k$. Hence we have $mn = k$, with the result $o(ab) = mn$. □

Now we shall prove the following lemma for finite fields.

Lemma 10.4.2 For any finite field $F_q$ having $q$ elements, the multiplicative group
$(F_q \setminus \{0\} = F_q^*, \cdot)$ is a cyclic group of order $q - 1$.

Proof If $q = 2$, the result is trivial. So we assume that $q \ge 3$. Then $|F_q^*| = q - 1 = h$,
say. Let $h = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_m^{\alpha_m}$ be the prime factorization of $h$. Now every polynomial
$x^{h/p_i} - 1 \in F_q[x]$ has at most $h/p_i$ roots in $F_q$. Since $h/p_i < h$, there are non-zero
elements in $F_q$ which are not roots of $x^{h/p_i} - 1$. Let $a_i$ be such an element. So
$a_i^{h/p_i} \ne 1$, the identity element of the field $F_q$. Further, let $b_i = a_i^{h/p_i^{\alpha_i}}$. Then
\[ b_i^{p_i^{\alpha_i}} = a_i^{h} = 1 \quad \text{and} \quad b_i^{p_i^{\alpha_i - 1}} = a_i^{h/p_i} \ne 1. \]
Hence by Lemma 10.10.10 (used in the setting of multiplicative groups), $o(b_i) = p_i^{\alpha_i}$.
Now by applying the result of Lemma 10.4.1 inductively, we get
\[ o(b_1 b_2 \cdots b_m) = o(b_1)\, o(b_2) \cdots o(b_m) = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_m^{\alpha_m} = q - 1. \]
Thus $F_q^*$ contains an element $b_1 b_2 \cdots b_m$ of order $q - 1$, which is the same as the order
of the group $F_q^*$, with the result that $F_q^*$ is a cyclic group. □

Theorem 10.4.5 $\mathbb{Z}_p^*$ is cyclic if $p$ is a prime integer.

Proof The theorem follows directly from Lemma 10.4.2. □

Remark For a positive integer $n \ge 2$, the group $\mathbb{Z}_n^*$ may not always be cyclic. The
cyclic property of the group $\mathbb{Z}_n^*$ is characterized in Theorem 10.4.8.

We now deal with the case of $\mathbb{Z}_n^*$ in which $n$ is an odd prime power. We shall
prove that if $p$ is an odd prime, $\mathbb{Z}_{p^e}^*$ is a cyclic group. Before proving the main result,
we shall prove the following lemmas.

Lemma 10.4.3 Let $(g)$ be a primitive root for $\mathbb{Z}_p^*$, where $p$ is an odd prime integer.
Then $(g)$ or $(g + p)$ is a primitive root for $\mathbb{Z}_{p^2}^*$.

Proof Theorem 10.4.5 ensures the existence of a primitive root, say $(g)$, for $\mathbb{Z}_p^*$.
Thus $g^{p-1} \equiv 1 \pmod{p}$, but $g^i \not\equiv 1 \pmod{p}$ for $1 \le i < p - 1$. Note that
$(g) \in \mathbb{Z}_{p^2}^*$ as $\gcd(g, p^2) = 1$. Let the order of $(g)$ in $\mathbb{Z}_{p^2}^*$ be $d$. Then $d$ divides
$\phi(p^2) = p(p - 1)$. Again, from the definition of the order of $(g)$, it follows that
$g^d \equiv 1 \pmod{p^2}$. So $g^d \equiv 1 \pmod{p}$. Also, $(g)$ has order $p - 1$ in $\mathbb{Z}_p^*$. Thus
$p - 1$ divides $d$. These two facts imply that either $d = p(p - 1)$ or $d = p - 1$.
If $d = p(p - 1)$, then $(g)$ is a primitive root for $\mathbb{Z}_{p^2}^*$ and hence we are done in that
case. So assume that $d = p - 1$. Let $h = g + p$. Since $h \equiv g \pmod{p}$, $(h)$ is also a
primitive root for $\mathbb{Z}_p^*$. Thus, arguing as before, we see that $(h)$ has order $p(p - 1)$ or
$p - 1$ in $\mathbb{Z}_{p^2}^*$. Since $g^d = g^{p-1} \equiv 1 \pmod{p^2}$ and $\mathbb{Z}_{p^2}^*$ is a commutative group, it
follows that
\[ h^{p-1} = (g + p)^{p-1} = g^{p-1} + (p - 1) g^{p-2} p + \cdots + p^{p-1} \equiv 1 - p g^{p-2} \pmod{p^2}, \]
where the dots represent terms divisible by $p^2$. Since $g$ is relatively prime to $p$, we
have $p g^{p-2} \not\equiv 0 \pmod{p^2}$ and hence $h^{p-1} \not\equiv 1 \pmod{p^2}$. Thus $(h)$ is not of order
$p - 1$ in $\mathbb{Z}_{p^2}^*$, so it must be of order $p(p - 1)$ and is therefore a primitive root for
$\mathbb{Z}_{p^2}^*$. Thus there exists a primitive root modulo $p^2$. □
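
The proof of Lemma 10.4.3 is constructive: a single modular power decides whether $g$ or $g + p$ should be used. A minimal Python sketch of that decision follows; the names `multiplicative_order` and `lift_primitive_root` are our own, and the order is computed by brute force purely so that the result can be checked.

```python
def multiplicative_order(a, n):
    """Order of a in the group of units of Z_n (assumes gcd(a, n) = 1)."""
    x, k = a % n, 1
    while x != 1:
        x = x * a % n
        k += 1
    return k


def lift_primitive_root(g, p):
    """Given a primitive root g modulo an odd prime p, return g or g + p,
    whichever is a primitive root modulo p^2 (as in Lemma 10.4.3)."""
    # g has order p - 1 modulo p^2 exactly when g^(p-1) = 1 (mod p^2);
    # in that case the proof shows that g + p has order p(p - 1).
    return g if pow(g, p - 1, p * p) != 1 else g + p
```

A case in which the second branch is taken is $p = 29$, $g = 14$: here $14$ is a primitive root modulo $29$ but $14^{28} \equiv 1 \pmod{29^2}$, so the function returns $43$.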

Lemma 10.4.4 Let $(h)$ be a primitive root for $\mathbb{Z}_{p^2}^*$ for an odd prime $p$. Then $(h)$ is
also a primitive root for $\mathbb{Z}_{p^e}^*$, for all integers $e \ge 2$.

Proof We shall prove the result by the principle of mathematical induction on $e$.
The result is trivially true for $e = 2$. Let $(h)$ be a primitive root modulo $p^e$ for
some $e \ge 2$ and $d$ be the order of $(h)$ modulo $p^{e+1}$. An argument similar to that
described in Lemma 10.4.3 shows that $d$ divides $\phi(p^{e+1}) = p^e(p - 1)$ and is
divisible by $\phi(p^e) = p^{e-1}(p - 1)$. Thus either $d = p^e(p - 1)$ or $d = p^{e-1}(p - 1)$.
In the first case, $(h)$ is a primitive root modulo $p^{e+1}$, as required. Thus it is
sufficient to eliminate the second case by showing that
$h^{p^{e-1}(p-1)} \not\equiv 1 \pmod{p^{e+1}}$.
Since $(h)$ is a primitive root modulo $p^e$, it has order $\phi(p^e) = p^{e-1}(p - 1)$
in $\mathbb{Z}_{p^e}^*$. Thus $h^{p^{e-2}(p-1)} \not\equiv 1 \pmod{p^e}$. However, $p^{e-2}(p - 1) = \phi(p^{e-1})$, so
$h^{p^{e-2}(p-1)} \equiv 1 \pmod{p^{e-1}}$ by Euler’s Theorem. Combining these two results, we
see that $h^{p^{e-2}(p-1)} = 1 + k p^{e-1}$, where $k$ is relatively prime to $p$. Again as before,
\[
\begin{aligned}
h^{p^{e-1}(p-1)} &= (1 + k p^{e-1})^p \\
&= 1 + \binom{p}{1} k p^{e-1} + \binom{p}{2} (k p^{e-1})^2 + \cdots + (k p^{e-1})^p \\
&= 1 + k p^e + \tfrac{1}{2} k^2 p^{2e-1}(p - 1) + \cdots + (k p^{e-1})^p,
\end{aligned}
\]
where the dots represent terms divisible by $(p^{e-1})^3$ and hence by $p^{e+1}$, since
$3(e - 1) \ge e + 1$ for $e \ge 2$. Thus
\[ h^{p^{e-1}(p-1)} \equiv 1 + k p^e + \tfrac{1}{2} k^2 p^{2e-1}(p - 1) \pmod{p^{e+1}}. \]
Now as $p$ is odd, the third term $k^2 p^{2e-1}(p - 1)/2$ is also divisible by $p^{e+1}$, since
$2e - 1 \ge e + 1$ for $e \ge 2$. Thus $h^{p^{e-1}(p-1)} \equiv 1 + k p^e \pmod{p^{e+1}}$.
Since $p$ does not divide $k$, we must have $h^{p^{e-1}(p-1)} \not\equiv 1 \pmod{p^{e+1}}$, as required.
Thus the lemma follows from the principle of mathematical induction. □

Theorem 10.4.6 For all $e > 0$, $\mathbb{Z}_{p^e}^*$ is cyclic if $p$ is an odd prime integer.

Proof The case with $e = 1$ follows from Theorem 10.4.5. The rest of the theorem
follows from Lemmas 10.4.3 and 10.4.4. □

Remark Note that we need $p$ to be odd, because if $p = 2$, then the third term
$k^2 p^{2e-1}(p - 1)/2 = k^2 2^{2e-2}$ is not divisible by $2^{e+1}$ when $e = 2$, so the first step
of the induction argument fails.

Theorem 10.4.7 The group $\mathbb{Z}_{2^e}^*$ is cyclic if and only if $e = 1$ or $e = 2$.

Proof The groups $\mathbb{Z}_2^* = \{(1)\}$ and $\mathbb{Z}_4^* = \{(1), (3)\}$ are cyclic, generated by $(1)$ and
by $(3)$, respectively. So it is sufficient to prove that $\mathbb{Z}_{2^e}^*$ is not cyclic for $e \ge 3$. We
shall prove that $\mathbb{Z}_{2^e}^*$ contains no element of order $\phi(2^e) = 2^{e-1}$ by showing that
$a^{2^{e-2}} \equiv 1 \pmod{2^e}$ for all odd $a$. We shall prove this by the principle
of mathematical induction on $e$. Let us first show that the relation holds for the
base value $e = 3$, that is, we need to show $a^2 \equiv 1 \pmod{8}$ for all odd $a$.
This is true, since if $a = 2b + 1$ then $a^2 = 4b(b + 1) + 1 \equiv 1 \pmod{8}$. Now we
assume that for some exponent $k \ge 3$, $a^{2^{k-2}} \equiv 1 \pmod{2^k}$ for all odd $a$. Then for
each odd $a$ we have $a^{2^{k-2}} = 1 + 2^k t$ for some integer $t$. Squaring both sides we get
\[ a^{2^{(k+1)-2}} = (1 + 2^k t)^2 = 1 + 2^{k+1} t + 2^{2k} t^2 = 1 + 2^{k+1}(t + 2^{k-1} t^2) \equiv 1 \pmod{2^{k+1}}, \]
proving that the result is true for $k + 1$. Hence by the principle of mathematical
induction, the result is true for all integers $e \ge 3$. This completes the proof. □

We now prove the following lemma, which will be useful in characterizing
all cyclic groups of the form $\mathbb{Z}_n^*$.

Lemma 10.4.5 If $n = ab$ where $a$ and $b$ are relatively prime and are both greater
than 2, then $\mathbb{Z}_n^*$ is not cyclic.

Proof Since $\gcd(a, b) = 1$, we have $\phi(n) = \phi(a)\phi(b)$. Moreover, as both $a$ and $b$
are greater than 2, both $\phi(a)$ and $\phi(b)$ are even, with the result that $\phi(n)$ is an
integer divisible by 4. Further, the integer $e = \phi(n)/2$ is divisible by both $\phi(a)$ and
$\phi(b)$. Note that if $(x)$ is a unit in $\mathbb{Z}_n$, then $(x)$ is also a unit in $\mathbb{Z}_a$ as well as in $\mathbb{Z}_b$.
Thus $x^{\phi(a)} \equiv 1 \pmod{a}$ and $x^{\phi(b)} \equiv 1 \pmod{b}$. Since both $\phi(a)$ and $\phi(b)$ divide $e$,
we therefore have $x^e \equiv 1 \pmod{a}$ and $x^e \equiv 1 \pmod{b}$. Since $a$ and $b$ are co-prime,
this implies that $x^e \equiv 1 \pmod{ab}$, that is, $x^e \equiv 1 \pmod{n}$. Thus every element of
$\mathbb{Z}_n^*$ has order dividing $e$, and since $e < \phi(n)$, this means that there is no primitive
root for $\mathbb{Z}_n^*$. □

Now we are in a position to provide a necessary and sufficient condition for $\mathbb{Z}_n^*$
to be cyclic.

Theorem 10.4.8 The group $\mathbb{Z}_n^*$ is cyclic if and only if
$n = 2$, $4$, $p^e$ or $2p^e$, where $p$ is an odd prime integer.

Proof ($\Leftarrow$) Clearly $\mathbb{Z}_2^*$ and $\mathbb{Z}_4^*$ are cyclic groups generated by $(1)$ and $(3)$,
respectively. Further, Theorem 10.4.6 ensures the case with odd prime powers. So, we may
assume that $n = 2p^e$, where $p$ is an odd prime integer. Now by Theorem 10.3.4, we
have $\phi(n) = \phi(2)\phi(p^e) = \phi(p^e)$. Also Theorem 10.4.6 ensures the existence of a
primitive root $(g)$ in $\mathbb{Z}_{p^e}^*$. Then $(g + p^e)$ is also a primitive root modulo $p^e$ and
one of $g$ or $g + p^e$ must be odd, resulting in the existence of an odd primitive
root, say $(h)$, modulo $p^e$. We will show that $(h)$ is also a primitive root modulo $2p^e$.
By its construction, $h$ is co-prime to both 2 and $p^e$, so $(h)$ is a unit in $\mathbb{Z}_{2p^e}$, i.e.,
$(h) \in \mathbb{Z}_{2p^e}^*$. If $h^i \equiv 1 \pmod{2p^e}$, then certainly $h^i \equiv 1 \pmod{p^e}$. Again, since $(h)$ is
a primitive root modulo $p^e$, $\phi(p^e)$ divides $i$. Since $\phi(p^e) = \phi(2p^e)$, it follows that
$\phi(2p^e)$ divides $i$, so $(h)$ has order $\phi(2p^e)$ in $\mathbb{Z}_{2p^e}^*$ and is therefore a primitive root,
with the result that $\mathbb{Z}_{2p^e}^*$ is a cyclic group.

($\Rightarrow$) Let us now prove the converse part. If $n \ne 2, 4, p^e$ or $2p^e$, where $p$ is an
odd prime integer, then we have either

(a) $n = 2^e$ where $e \ge 3$, or

(b) $n = 2^e p^f$ where $e \ge 2$, $f \ge 1$ and $p$ is an odd prime, or

(c) $n$ is divisible by at least two odd primes.

We shall show that in all the cases, $\mathbb{Z}_n^*$ is not cyclic. The case (a) follows from
Theorem 10.4.7. For the case (b), we can take $a = 2^e$ and $b = p^f$, while for case
(c) we can take $a$ to be of the form $p^e$ for some odd prime $p$ dividing $n$ such that
$p^e$ divides $n$ but $p^{e+1}$ does not divide $n$, and $b = n/a$. In either case, $n = ab$ where
$a$ and $b$ are co-prime and greater than 2. Thus Lemma 10.4.5 ensures that $\mathbb{Z}_n^*$ is not
cyclic. This completes the proof of the theorem. □
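
Theorem 10.4.8 can be checked experimentally for small moduli by comparing a brute-force search for a generator with the arithmetic condition on $n$. The Python sketch below is only such an experiment; both function names are our own, and the search is exhaustive, which is assumed to be acceptable only for small $n$.

```python
from math import gcd


def is_cyclic_units_group(n):
    """Brute force: does some unit of Z_n (n >= 2) have order phi(n)?"""
    us = [a for a in range(1, n) if gcd(a, n) == 1]
    for g in us:
        x, order = g, 1
        while x != 1:          # compute the order of g by repeated multiplication
            x = x * g % n
            order += 1
        if order == len(us):
            return True
    return False


def matches_theorem_form(n):
    """True iff n is 2, 4, p^e or 2p^e for an odd prime p and e >= 1."""
    if n in (2, 4):
        return True
    if n % 2 == 0:
        n //= 2                # strip the single allowed factor 2
    if n % 2 == 0 or n < 3:
        return False
    p = 3                      # smallest odd divisor > 1 of an odd n is prime
    while n % p != 0:
        p += 2
    while n % p == 0:
        n //= p
    return n == 1              # nothing but powers of p may remain
```

For every $n$ with $2 \le n < 200$ the two functions agree, as the theorem predicts.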


10.5 Quadratic Residues and Quadratic Reciprocity 

Let $p$ be an odd prime and $a$ an integer such that $\gcd(a, p) = 1$, i.e., $a$ and $p$ are
relatively prime. In this section, we shall deal with the question whether $a$ is a
perfect square modulo $p$. Towards finding the answer, let us start with the following
definition.

Definition 10.5.1 If $n$ is a positive integer, we say that the integer $a$ is a quadratic
residue modulo $n$ iff $\gcd(a, n) = 1$ and the congruence $x^2 \equiv a \pmod{n}$ has a
solution in the set of integers. If the congruence $x^2 \equiv a \pmod{n}$ has no solution in the
set of integers, we say that $a$ is a quadratic non-residue of $n$.

Remark 1 While defining an integer $a$ to be a quadratic residue modulo a positive
integer $n$, instead of considering $a$ to be an integer, we may very well consider it
to be an equivalence class $(a) \in \mathbb{Z}_n^*$. Thus Definition 10.5.1 may be restated as
follows:

If $n$ is a positive integer, we say that the integer $a$ is a quadratic residue modulo $n$
iff $\gcd(a, n) = 1$ and there exists an equivalence class $(x) \in \mathbb{Z}_n^*$ such that $(a) = (x)^2$.

Remark 2 The above definition can be extended to any group. Given a group $G$, an
element $a \in G$ is a quadratic residue if there exists an element $x \in G$ such that
$x^2 = a$. In this case, we call $x$ a square root of $a$. An element that is not a
quadratic residue is called a quadratic non-residue.

Now we are going to explore some algebraic properties of the set of quadratic
residues.

Proposition 10.5.1 In an abelian group $G$, the set of all quadratic residues,
denoted by $\mathcal{QR}$, forms a subgroup of $G$.

Proof The identity element of $G$ is always a quadratic residue. Thus $\mathcal{QR} \ne \emptyset$. Let
$x, y \in \mathcal{QR}$. Then there exist some $z, w \in G$ such that $x = z^2$ and $y = w^2$. This
implies $xy^{-1} = (zw^{-1})^2$, with the result $xy^{-1} \in \mathcal{QR}$. As $\mathcal{QR} \subseteq G$, it follows that
$\mathcal{QR}$ is a subgroup of $G$. □

Example 10.5.1 Let us consider $n = 11$. We are going to find all integers $a$ modulo
11 such that $x^2 \equiv a \pmod{11}$ has a solution in $\mathbb{Z}$. To find all such $a$'s, let us start
from the other direction by calculating the following: $1^2 \equiv 10^2 \equiv 1 \pmod{11}$,
$2^2 \equiv 9^2 \equiv 4 \pmod{11}$, $3^2 \equiv 8^2 \equiv 9 \pmod{11}$, $4^2 \equiv 7^2 \equiv 5 \pmod{11}$,
$5^2 \equiv 6^2 \equiv 3 \pmod{11}$.
These calculations imply that in the group $\mathbb{Z}_{11}^*$ the equivalence classes $(1)$, $(4)$, $(9)$,
$(5)$, $(3)$ modulo 11 are the only elements which are quadratic residues modulo
11, while $(2)$, $(6)$, $(7)$, $(8)$, $(10)$ are the only elements in $\mathbb{Z}_{11}^*$ which are quadratic
non-residues modulo 11. Also note that the set of all quadratic residues modulo 11,
denoted by $\mathcal{QR}_{11}$, forms a subgroup of $\mathbb{Z}_{11}^*$.
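
The table of squares in Example 10.5.1 can be reproduced mechanically. The short Python sketch below squares every non-zero class modulo an odd prime; the function names are illustrative and not part of the text.

```python
def quadratic_residues(p):
    """Sorted representatives of the quadratic residues in Z_p^* (p an odd prime)."""
    return sorted({x * x % p for x in range(1, p)})


def quadratic_non_residues(p):
    """Sorted representatives of the quadratic non-residues in Z_p^*."""
    residues = set(quadratic_residues(p))
    return [a for a in range(1, p) if a not in residues]


if __name__ == "__main__":
    print(quadratic_residues(11))
    print(quadratic_non_residues(11))
```

For $p = 11$ the two lists are $1, 3, 4, 5, 9$ and $2, 6, 7, 8, 10$, in agreement with the example.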

The above example demonstrates that not all elements of $\mathbb{Z}_{11}^*$ are quadratic
residues modulo 11. Also, the congruence $x^2 \equiv a \pmod{11}$ has either no solution
or exactly two incongruent solutions modulo 11. This observation leads to the
following theorem.

Theorem 10.5.1 Let $p$ be an odd prime and $a$ be an integer such that $\gcd(a, p) = 1$.
Then the congruence $x^2 \equiv a \pmod{p}$ has either no solution or exactly two
incongruent solutions modulo $p$.

Proof Let the congruence $x^2 \equiv a \pmod{p}$ have a solution in $\mathbb{Z}$, say $y_1$. Then
$y_1^2 \equiv a \pmod{p}$. This implies $(-y_1)^2 \equiv a \pmod{p}$, so $-y_1$ is also
a solution of the congruence $x^2 \equiv a \pmod{p}$. Now we shall show that $y_1$ and $-y_1$
are incongruent modulo $p$. If not, then $y_1 \equiv -y_1 \pmod{p}$, i.e., $p$ divides $2y_1$.
As $p$ is an odd prime, $p$ does not divide 2. Thus $p \mid y_1$. So $y_1^2 \equiv 0 \pmod{p}$. Also,
$y_1^2 \equiv a \pmod{p}$ implies that $a \equiv 0 \pmod{p}$, i.e., $p \mid a$, contradicting the fact that
$\gcd(a, p) = 1$. Hence $y_1 \not\equiv -y_1 \pmod{p}$. Now we shall prove that there exist
exactly two incongruent solutions modulo $p$. Let $z$ be any solution of $x^2 \equiv a \pmod{p}$.
Then $z^2 \equiv a \pmod{p}$. As $y_1^2 \equiv a \pmod{p}$, $y_1^2 \equiv z^2 \pmod{p}$, which implies that
$p \mid (y_1 + z)(y_1 - z)$, so that either $p \mid (y_1 + z)$ or $p \mid (y_1 - z)$, with the result that
either $z \equiv y_1 \pmod{p}$ or $z \equiv -y_1 \pmod{p}$. Hence the congruence $x^2 \equiv a \pmod{p}$
has either no solution or exactly two incongruent solutions modulo $p$. □

Corollary If $p$ is an odd prime integer, then the equation $x^2 \equiv 1 \pmod{p}$ has
exactly two incongruent solutions, namely $1$ and $-1$.

Proof For $p = 2$, the result is trivial as $1 \equiv -1 \pmod{2}$. The rest of the corollary
follows from the above Theorem 10.5.1. □

The following result demonstrates that if $p$ is an odd prime, then there are exactly
as many quadratic residues as quadratic non-residues in $\mathbb{Z}_p^*$.

Proposition 10.5.2 Let $p$ be an odd prime integer. Then exactly half the elements
of $\mathbb{Z}_p^*$ are quadratic residues.

Proof Define a map $\mathrm{sq}_p : \mathbb{Z}_p^* \to \mathbb{Z}_p^*$ by $\mathrm{sq}_p(x) = x^2$ for all $x \in \mathbb{Z}_p^*$. It
follows from Theorem 10.5.1 that $\mathrm{sq}_p$ is a two-to-one function for any odd prime $p$.
This immediately implies that exactly half the elements of $\mathbb{Z}_p^*$ are quadratic residues.
We denote the set of quadratic residues modulo $p$ by $\mathcal{QR}_p$, and the set of quadratic
non-residues by $\mathcal{QNR}_p$. Then
\[ |\mathcal{QR}_p| = |\mathcal{QNR}_p| = \frac{|\mathbb{Z}_p^*|}{2} = \frac{p - 1}{2}. \] □

A special notation, known as the Legendre symbol, associated with quadratic
residues is defined as follows.

Definition 10.5.2 Let $p$ be an odd prime and $a$ an integer such that $\gcd(a, p) = 1$.
Then the Legendre symbol $\left(\frac{a}{p}\right)$ is defined by
\[ \left(\frac{a}{p}\right) = \begin{cases} 1, & \text{if } a \text{ is a quadratic residue modulo } p, \\ -1, & \text{if } a \text{ is not a quadratic residue modulo } p. \end{cases} \]

Example 10.5.2 From Example 10.5.1, we find that
\[ \left(\frac{a}{11}\right) = \begin{cases} 1, & \text{for } a = 1, 3, 4, 5, 9, \\ -1, & \text{for } a = 2, 6, 7, 8, 10. \end{cases} \]

We now characterize the quadratic residues in $\mathbb{Z}_p^*$ for an odd prime $p$. We may
recall the fact that $\mathbb{Z}_p^*$ is a cyclic group of order $p - 1$ (see Theorem 10.4.5). Let
$g$ be a generator of $\mathbb{Z}_p^*$. As $p$ is an odd prime, $\mathbb{Z}_p^*$ may be written as
\[ \mathbb{Z}_p^* = \{g^0, g^1, g^2, \ldots, g^{\frac{p-1}{2} - 1}, g^{\frac{p-1}{2}}, g^{\frac{p-1}{2} + 1}, \ldots, g^{p-2}\}. \]
As in $\mathbb{Z}_p^*$, $g^i = g^{i \bmod (p-1)}$ for
all $i \in \mathbb{Z}$, squaring each element in this list and reducing modulo $p - 1$ in the
exponent yields a multi-set $\{g^0, g^2, g^4, \ldots, g^{p-3}, g^0, g^2, g^4, \ldots, g^{p-3}\}$, containing the
list of all the quadratic residues in $\mathbb{Z}_p^*$. Note that in the multi-set, each quadratic
residue appears exactly twice. Thus $\mathcal{QR}_p = \{g^0, g^2, g^4, \ldots, g^{p-3}\}$. We see that the
quadratic residues in $\mathbb{Z}_p^*$ are exactly those elements that can be written as $g^i$ with
$i \in \{0, \ldots, p - 3\}$ an even integer. This leads to the following proposition.

Proposition 10.5.3 If $g$ is a generator of $\mathbb{Z}_p^*$, then
\[ g^n \text{ is a } \begin{cases} \text{quadratic residue}, & \text{if } n \text{ is even}, \\ \text{quadratic non-residue}, & \text{if } n \text{ is odd}. \end{cases} \]

The above characterization leads to a simple way to compute the Legendre
symbol, and this tells us whether a given element $x \in \mathbb{Z}_p^*$ is a quadratic residue or not by
using the following proposition.

Proposition 10.5.4 Let $p$ be an odd prime and $a$ be an integer such that
$\gcd(a, p) = 1$. Then
\[ \left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod{p}. \]

Proof Let $a$ be a quadratic residue modulo $p$. Then $\left(\frac{a}{p}\right) = 1$. Also, let $(g)$ be a
generator of $\mathbb{Z}_p^*$. Then by Proposition 10.5.3, $(a) = (g)^{2i}$ for some integer $i$. Thus
$a^{\frac{p-1}{2}} = (g^{2i})^{\frac{p-1}{2}} = (g^{p-1})^i \equiv 1 \pmod{p}$, by Fermat’s Little Theorem 10.4.2. So
$\left(\frac{a}{p}\right) = 1 \equiv a^{\frac{p-1}{2}} \pmod{p}$, as claimed.

On the other hand, if $a$ is not a quadratic residue, then by Proposition 10.5.3,
$(a) = (g)^{2i+1}$ for some integer $i$. Thus
\[ a^{\frac{p-1}{2}} = (g^{2i+1})^{\frac{p-1}{2}} = (g^{2i})^{\frac{p-1}{2}}\, g^{\frac{p-1}{2}} = (g^{p-1})^i\, g^{\frac{p-1}{2}} \equiv g^{\frac{p-1}{2}} \pmod{p}. \]
Now $(g^{\frac{p-1}{2}})^2 = g^{p-1} \equiv 1 \pmod{p}$. Thus $(g)^{\frac{p-1}{2}}$
is either $+(1)$ or $-(1)$ by the Corollary of Theorem 10.5.1. As $(g)$ is a generator
of $\mathbb{Z}_p^*$, the order of $(g)$ is $p - 1$, and hence $(g)^{\frac{p-1}{2}}$ cannot be $+(1)$. Thus
$a^{\frac{p-1}{2}} \equiv -1 \pmod{p}$, which is the value of $\left(\frac{a}{p}\right)$. □
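
Proposition 10.5.4 gives a practical way to evaluate the Legendre symbol with one modular exponentiation. The following Python sketch assumes, as in the proposition, that $p$ is an odd prime and $\gcd(a, p) = 1$; the function name `legendre` is our own, and neither assumption is verified by the code.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via a^((p-1)/2) mod p.

    Assumes p is an odd prime and gcd(a, p) = 1, as in Proposition 10.5.4.
    """
    r = pow(a, (p - 1) // 2, p)   # r is either 1 or p - 1 under the assumptions
    return -1 if r == p - 1 else r


if __name__ == "__main__":
    print([legendre(a, 11) for a in range(1, 11)])
```

For $p = 11$ the printed values are $1$ for $a = 1, 3, 4, 5, 9$ and $-1$ for $a = 2, 6, 7, 8, 10$, matching Example 10.5.2.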

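Using the above Proposition 10.5.4, we state the following proposition.

Proposition 10.5.5 Let $p$ be an odd prime and $a$, $b$ be two integers such that
$\gcd(a, p) = 1$ and $\gcd(b, p) = 1$. Then

(i) $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$;

(ii) $a \equiv b \pmod{p}$ implies $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$;

(iii) $\left(\frac{a^2}{p}\right) = 1$, $\left(\frac{1}{p}\right) = 1$, $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$.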



The following proposition is an immediate consequence of Proposition 10.5.5.

Proposition 10.5.6 Let $\mathcal{QR}_p$ and $\mathcal{QNR}_p$ denote, respectively, the set of all
quadratic residues and the set of all quadratic non-residues modulo an odd prime
$p$. Let $a, a' \in \mathcal{QR}_p$ and $b, b' \in \mathcal{QNR}_p$. Then

(i) $aa' \in \mathcal{QR}_p$;

(ii) $bb' \in \mathcal{QR}_p$;

(iii) $ab \in \mathcal{QNR}_p$.

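The following proposition leads to an alternative proof of Proposition 10.5.2.

Proposition 10.5.7 Let $p$ be an odd prime and $a$ be an integer such that
$1 \le a \le p - 1$. Then
\[ \sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = 0. \]

Proof Let $(g)$ be a generator of $\mathbb{Z}_p^*$. As $(a) \in \mathbb{Z}_p^*$, there exists a unique $i$,
$1 \le i \le p - 1$, such that $(a) = (g)^i$. Now, we claim that $\left(\frac{g}{p}\right) = -1$. If not, then
$\left(\frac{g}{p}\right) = 1$. This implies that $g^{\frac{p-1}{2}} \equiv 1 \pmod{p}$, contradicting the fact that
the order of $g$ is $p - 1$. Thus $\left(\frac{g}{p}\right) = -1$. Now,
\[ \sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = \sum_{i=1}^{p-1} \left(\frac{g^i}{p}\right) = \sum_{i=1}^{p-1} \left(\frac{g}{p}\right)^i = \sum_{i=1}^{p-1} (-1)^i = 0. \] □

The following remark is an immediate consequence of the above proposition and
provides an alternative proof of Proposition 10.5.2.

Remark Let $p$ be an odd prime integer. Then there are precisely $\frac{p-1}{2}$ quadratic
residues and $\frac{p-1}{2}$ quadratic non-residues modulo $p$.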

Gauss provided another elegant criterion to determine whether an integer $a$
relatively prime to $p$ is a quadratic residue modulo an odd prime $p$.

Theorem 10.5.2 (Gauss Lemma) Let $p$ be an odd prime and $a$ be an integer
such that $\gcd(a, p) = 1$. Consider the integers $a, 2a, 3a, \ldots, \frac{p-1}{2}a$ and their least
non-negative residues modulo $p$. Let $\mu$ denote the number of these
residues modulo $p$ that exceed $\frac{p}{2}$. Then
\[ \left(\frac{a}{p}\right) = (-1)^{\mu}. \]

Proof Let $\mathcal{R} = \{a, 2a, 3a, \ldots, \frac{p-1}{2}a\}$. As $\gcd(a, p) = 1$, none of these integers
in $\mathcal{R}$ is congruent to 0 modulo $p$ and no two of them are congruent to each other
modulo $p$. Let $r_1, r_2, \ldots, r_\mu$ denote the least positive residues of these elements of
$\mathcal{R}$ that exceed $\frac{p}{2}$ and let $s_1, s_2, \ldots, s_t$ denote the remaining least positive residues
of the elements of $\mathcal{R}$. Note that $\mu + t = \frac{p-1}{2}$ and $0 < p - r_i < \frac{p}{2}$ for $1 \le i \le \mu$.
So the integers $p - r_1, p - r_2, \ldots, p - r_\mu, s_1, s_2, \ldots, s_t$ are all positive integers
less than $\frac{p}{2}$. We shall prove that these integers are all distinct. To prove
that, it is sufficient to prove that no $p - r_i$ is equal to any $s_j$. If possible, let for
some $i$ and $j$, $p - r_i = s_j$, i.e., $r_i + s_j = p$. Then there exist integers $x$ and $y$
such that $r_i \equiv ax \pmod{p}$ and $s_j \equiv ay \pmod{p}$ with $1 \le x, y \le \frac{p-1}{2}$. Thus
$(x + y)a \equiv r_i + s_j = p \equiv 0 \pmod{p}$. As $\gcd(a, p) = 1$, $p \mid (x + y)$, contradicting the
fact that $1 \le x, y \le \frac{p-1}{2}$. Hence $p - r_1, p - r_2, \ldots, p - r_\mu, s_1, s_2, \ldots, s_t$ are just
all the integers $1, 2, \ldots, \frac{p-1}{2}$ in some order. Hence their product is simply $\left(\frac{p-1}{2}\right)!$.
Thus we have
\[
\begin{aligned}
\left(\frac{p-1}{2}\right)! &= (p - r_1)(p - r_2) \cdots (p - r_\mu) \cdot s_1 s_2 \cdots s_t \\
&\equiv (-1)^{\mu}\, r_1 r_2 \cdots r_\mu \cdot s_1 s_2 \cdots s_t \pmod{p}.
\end{aligned}
\]
As $r_1, r_2, \ldots, r_\mu, s_1, s_2, \ldots, s_t$ are the least positive residues modulo $p$ of
$a, 2a, \ldots, \frac{p-1}{2}a$, in some order, we have
\[
\begin{aligned}
\left(\frac{p-1}{2}\right)! &\equiv (-1)^{\mu}\, a \cdot 2a \cdots \left(\frac{p-1}{2}\right) a \pmod{p} \\
&\equiv (-1)^{\mu}\, a^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)! \pmod{p}.
\end{aligned}
\]
As $\gcd\left(\left(\frac{p-1}{2}\right)!, p\right) = 1$, the factor $\left(\frac{p-1}{2}\right)!$ can be cancelled from both sides,
resulting in
\[ 1 \equiv (-1)^{\mu} a^{\frac{p-1}{2}} \pmod{p}. \]
Multiplying both sides by $(-1)^{\mu}$, we have
\[ a^{\frac{p-1}{2}} \equiv (-1)^{\mu} \pmod{p}. \]
Now, from Proposition 10.5.4, it follows that $\left(\frac{a}{p}\right) \equiv (-1)^{\mu} \pmod{p}$ and hence
$\left(\frac{a}{p}\right) = (-1)^{\mu}$. □

Example 10.5.3 To check whether the congruence $x^2 \equiv 5 \pmod{13}$ has solutions
in the set of integers, we may use the Gauss Lemma. Here $a = 5$, $p = 13$ and the
set $\mathcal{R} = \{5, 10, 15, 20, 25, 30\}$. Now the least positive residues modulo 13 of these
elements of $\mathcal{R}$ are 5, 10, 2, 7, 12, 4, respectively. Out of these elements, only 10, 7,
and 12 exceed $p/2 = 6.5$. Thus here $\mu = 3$. Hence by the Gauss Lemma,
$\left(\frac{5}{13}\right) = (-1)^{\mu} = (-1)^3 = -1$, with the result that the congruence $x^2 \equiv 5 \pmod{13}$ has no
solution in the set of integers.
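
The count $\mu$ in the Gauss Lemma is straightforward to compute. The Python sketch below reproduces the calculation of Example 10.5.3; the function names `gauss_mu` and `legendre_by_gauss` are our own, and $p$ is assumed to be an odd prime with $\gcd(a, p) = 1$.

```python
def gauss_mu(a, p):
    """Number of residues of a, 2a, ..., ((p-1)/2)a modulo p that exceed p/2."""
    half = (p - 1) // 2
    # For integers, "residue > p/2" is the same as "residue > (p - 1)/2".
    return sum(1 for i in range(1, half + 1) if (i * a) % p > half)


def legendre_by_gauss(a, p):
    """Legendre symbol (a/p) computed as (-1)^mu (Gauss Lemma)."""
    return -1 if gauss_mu(a, p) % 2 else 1


if __name__ == "__main__":
    print(gauss_mu(5, 13), legendre_by_gauss(5, 13))
```

For $a = 5$, $p = 13$ this gives $\mu = 3$ and the symbol $-1$, as in the example.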

Using the Gauss Lemma, we now characterize all odd primes $p$ for which 2 is a
quadratic residue modulo $p$.

Theorem 10.5.3 Let $p$ be an odd prime integer. Then
\[ \left(\frac{2}{p}\right) = \begin{cases} 1, & \text{if } p \equiv 1 \text{ or } 7 \pmod{8}, \\ -1, & \text{if } p \equiv 3 \text{ or } 5 \pmod{8}. \end{cases} \]

Proof To prove the theorem, it is sufficient to prove that 2 is a quadratic residue
modulo $p$ if and only if $p \equiv 1 \pmod{8}$ or $p \equiv 7 \pmod{8}$. Let us consider the set
$S = \{2, 4, \ldots, p - 1\}$. Then the number $\mu$ of integers from the set $S$ that exceed
$p/2$ is the same as the number of integers that exceed $p/4$ in the set $\{1, 2, \ldots, \frac{p-1}{2}\}$.
Thus $\mu = \frac{p-1}{2} - \left\lfloor \frac{p}{4} \right\rfloor$.

Now as $p$ is an odd prime, depending on the remainder modulo 8, there are the
following four cases:

Case 1: If $p = 8k + 1$, for some $k \ge 1$, then $\mu = 4k - 2k = 2k$, an even integer.

Case 2: If $p = 8k + 3$, for some $k \ge 0$, then $\mu = 4k + 1 - 2k = 2k + 1$, an odd
integer.

Case 3: If $p = 8k + 5$, for some $k \ge 0$, then $\mu = 4k + 2 - 2k - 1 = 2k + 1$, an odd
integer.

Case 4: If $p = 8k + 7$, for some $k \ge 0$, then $\mu = 4k + 3 - 2k - 1 = 2k + 2$, an even
integer.

The theorem now follows from Theorem 10.5.2 (Gauss Lemma). □

Remark The integer 2 is a quadratic residue of all primes $p \equiv \pm 1 \pmod{8}$ and a
quadratic non-residue of all primes $p \equiv \pm 3 \pmod{8}$.
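
The remark reduces the question whether 2 is a quadratic residue to a single remainder modulo 8. As a small cross-check, the Python sketch below compares that rule with a direct evaluation of $2^{(p-1)/2}$ modulo $p$; both function names are our own, and $p$ is assumed to be an odd prime.

```python
def legendre_two_by_residue(p):
    """(2/p) read off from p mod 8: +1 for p = 1, 7 (mod 8), -1 for p = 3, 5 (mod 8)."""
    return 1 if p % 8 in (1, 7) else -1


def legendre_two_by_power(p):
    """(2/p) computed as 2^((p-1)/2) mod p, for an odd prime p."""
    r = pow(2, (p - 1) // 2, p)
    return -1 if r == p - 1 else r
```

The two functions agree on every odd prime tried; for instance both return $-1$ for $p = 13$ and $+1$ for $p = 17$.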

We are now going to prove the following lemma that facilitates a passage from 
the Gauss Lemma to the proof of one of the celebrated theorems in number theory, 
known as, Law of Quadratic Reciprocity. 

Lemma 10.5.1 Let p be an odd prime and a be an odd integer with gcd (a, p) = 1 . 
Then 

(») = ,-l)E& nn VIP\ 

where \ia/ p] denotes the greatest integer not exceeding ia/ p. 


Proof Let us consider the integers a, 2a, , [^-\a. For / = 1,2,..., each 

ia can be uniquely represented by ia — qt p + Ui where 1 <Ui < p. Thus ia/ p — 


10.5 Quadratic Residues and Quadratic Reciprocity 


437 


qi + Ui/ p. So \ia/p\ = qt. Hence, for i = 1, 2, . . . , 


la = 



+ U [ . 


( 10 . 1 ) 


As in the Gauss Lemma (Theorem 10.5.2), let us use the notations for r\ and sj , 
i.e., if the remainder w/ > p/2, then it is one of the integers r\ , 7 * 2 , . . . , r^, while if 
w; < p/2, then it is one of the integers s\, S 2 , . . . , s t . Now from (10.1), we have 


(p- D/2 

I] '« = 

i=l 


(p-l)/2 r . 

ia 

P J 


i=l 


- p+E r «-+E s «-- 


i=l 


i=l 


(10.2) 


As it was argued while proving the Gauss Lemma that the integers p — r\,p — 

T 2 , • • • , p — r^, s\, S 2 , • • • , s t are just all the integers 1,2,..., ^~y~, in some order, 
we have 

iP~ 0/2 At A li t 

E * =E^ _r ^ + E' y ' =^ _ E n + E' y/ - ( io - 3 ) 

i=l i=l i=l i=l i=l 

Subtracting (10.3) from (10.2), we get

\[ (a - 1) \sum_{i=1}^{(p-1)/2} i = p \left( \sum_{i=1}^{(p-1)/2} \left[\frac{ia}{p}\right] - \mu \right) + 2 \sum_{i=1}^{\mu} r_i. \tag{10.4} \]


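As a and p are both odd, $a \equiv p \equiv 1 \pmod 2$. Thus equation (10.4) becomes

\[ 0 \cdot \sum_{i=1}^{(p-1)/2} i \equiv 1 \cdot \left( \sum_{i=1}^{(p-1)/2} \left[\frac{ia}{p}\right] - \mu \right) \pmod 2, \]

resulting in

\[ \sum_{i=1}^{(p-1)/2} \left[\frac{ia}{p}\right] \equiv \mu \pmod 2. \]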


Thus from the Gauss Lemma, we have

\[ \left(\frac{a}{p}\right) = (-1)^{\mu} = (-1)^{\sum_{i=1}^{(p-1)/2} \left[\frac{ia}{p}\right]}, \]

as required. □
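
Lemma 10.5.1 lends itself to a quick numerical check. The sketch below is only illustrative: it evaluates the Legendre symbol by Euler's criterion (an assumption made for the check) and compares it with the sign given by the floor sum; the function names are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p with gcd(a, p) = 1, via Euler's criterion."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def lemma_sign(a: int, p: int) -> int:
    """(-1) raised to the sum of floor(i*a/p) for i = 1, ..., (p-1)/2."""
    exponent = sum((i * a) // p for i in range(1, (p - 1) // 2 + 1))
    return -1 if exponent % 2 else 1


small_primes = [3, 5, 7, 11, 13, 17, 19]
# The lemma requires a to be odd and coprime to p.
agrees = all(
    legendre(a, p) == lemma_sign(a, p)
    for p in small_primes
    for a in range(1, 60, 2)
    if a % p != 0
)
print(agrees)
```

Note that the lemma needs a to be odd; for even a the two sides can differ, which is why only odd values are sampled.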

We are now in a position to prove the Law of Quadratic Reciprocity, a celebrated
theorem in number theory. Suppose p and q are distinct odd primes and we know
whether q is a quadratic residue of p. Then the question is: is p a quadratic
residue modulo q? In the mid-1700s, Euler found the answer to this question by
examining numerical evidence, but failed to prove the result. Later, in 1785, Legendre
reformulated Euler's answer in its modern form. Though Legendre proposed
several proofs of the theorem, each of his proofs contained a serious gap. The first
correct proof was given by Gauss, who claimed to have rediscovered this result
when he was 18 years old. Gauss devised seven more proofs, each based on a different
approach. The proof presented below, a variant of one of Gauss' own arguments,
is due to his student Ferdinand Eisenstein.

[Fig. 10.1 Lattice points within the rectangle; the point (x, xq/p) on the diagonal is marked]

Theorem 10.5.4 (Law of Quadratic Reciprocity) Let p and q be two distinct odd
primes. Then

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \]



Proof Let us consider a rectangle in the xy coordinate plane, whose vertices are
(0, 0), (p/2, 0), (p/2, q/2) and (0, q/2), as shown in Fig. 10.1.

We consider the points with integer coordinates within this rectangle, not including
any points on the bounding lines, i.e., the points on the x-axis, the y-axis,
and the lines x = p/2 and y = q/2 are excluded. These points are called the
lattice points within the rectangle. We are now going to count the number
of these lattice points in two different ways. As p and q are both odd integers, the
lattice points within the rectangle consist of all points (x, y) such that
$1 \leq x \leq \frac{p-1}{2}$ and $1 \leq y \leq \frac{q-1}{2}$. Thus the number of such points is
$\frac{p-1}{2} \cdot \frac{q-1}{2}$.

Now we are going to count the number of these points in a different way. Let us
consider the diagonal D passing through the points (0, 0) and (p/2, q/2). Then the
equation of the diagonal D is given by y = (q/p)x. As gcd(p, q) = 1, we claim
that none of the lattice points within the rectangle lies on the diagonal D. If possible,
let there exist some lattice point on the diagonal, say (a, b), within the rectangle.
Then we have b = (q/p)a, i.e., pb = qa. As gcd(p, q) = 1, this implies that p divides a
and q divides b, contradicting the fact that $1 \leq a \leq \frac{p-1}{2}$ and $1 \leq b \leq \frac{q-1}{2}$.
Hence none of the lattice points within the rectangle lies on the diagonal D. As the
diagonal D divides the rectangle into two parts, let B denote the portion of the
rectangle below the diagonal D and let U denote the portion above. To count the
total number of lattice points within the rectangle, it is sufficient to count the
number of lattice points inside each of the portions B and U.
First let us count the number of lattice points within B. Let us fix any point (x, 0)
on the x-axis such that x is an integer with $1 \leq x < p/2$. Now if we draw a vertical
line through the point (x, 0), then it cuts the diagonal at (x, qx/p). As the lattice
points within the rectangle lie neither on the x-axis nor on the diagonal, there are
precisely $[qx/p]$ lattice points in B directly above the point (x, 0) and below D. Now, if
we let x range from 1 to (p - 1)/2, the total number of lattice points contained in B
is $\sum_{x=1}^{(p-1)/2} \left[\frac{xq}{p}\right]$. By a similar argument, with the roles of p and q
interchanged, it can also be shown that the total number of lattice points within U is
$\sum_{y=1}^{(q-1)/2} \left[\frac{yp}{q}\right]$. Thus equating the two counts, we have

\[ \frac{p-1}{2} \cdot \frac{q-1}{2} = \sum_{x=1}^{(p-1)/2} \left[\frac{xq}{p}\right] + \sum_{y=1}^{(q-1)/2} \left[\frac{yp}{q}\right]. \]
Now by Lemma 10.5.1, we have

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right)
 = (-1)^{\sum_{y=1}^{(q-1)/2} \left[\frac{yp}{q}\right]} \cdot (-1)^{\sum_{x=1}^{(p-1)/2} \left[\frac{xq}{p}\right]}
 = (-1)^{\sum_{x=1}^{(p-1)/2} \left[\frac{xq}{p}\right] + \sum_{y=1}^{(q-1)/2} \left[\frac{yp}{q}\right]}
 = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \]

This completes the proof of the law of quadratic reciprocity.

□
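
Both the lattice-point count and the resulting reciprocity law can be verified for small primes. The following is a minimal sketch under the assumption that Legendre symbols are evaluated by Euler's criterion; the names `points_below_diagonal` and `check_reciprocity` are illustrative only.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, via Euler's criterion."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def points_below_diagonal(p: int, q: int) -> int:
    """Lattice points in region B: sum of floor(x*q/p) for x = 1, ..., (p-1)/2."""
    return sum((x * q) // p for x in range(1, (p - 1) // 2 + 1))


def check_reciprocity(p: int, q: int) -> bool:
    """Check the two-way count and the sign identity for distinct odd primes p, q."""
    total = ((p - 1) // 2) * ((q - 1) // 2)
    # Swapping the roles of p and q counts the points in region U.
    count_ok = points_below_diagonal(p, q) + points_below_diagonal(q, p) == total
    sign = -1 if total % 2 else 1
    return count_ok and legendre(p, q) * legendre(q, p) == sign


primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(all(check_reciprocity(p, q) for p in primes for q in primes if p != q))
```

For example, with p = 3 and q = 5 the rectangle contains 1 * 2 = 2 lattice points, one in B and one in U.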


As a consequence, the following corollary is immediate. 

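Corollary Let p and q be two distinct odd primes. Then

\[ \left(\frac{p}{q}\right) = \begin{cases}
 \left(\frac{q}{p}\right) & \text{if } p \equiv 1 \pmod 4 \text{ or } q \equiv 1 \pmod 4, \\
 -\left(\frac{q}{p}\right) & \text{if } p \equiv q \equiv 3 \pmod 4.
\end{cases} \]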

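Remark The law of quadratic reciprocity is sometimes very useful in calculating
$\left(\frac{a}{p}\right)$ with a large prime p and a comparatively small a, where the factorization of a
is known. For example, if $a = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ is the prime factorization of a, then
$\left(\frac{a}{p}\right) = \prod_{i=1}^{k} \left(\frac{p_i}{p}\right)^{\alpha_i}$. Now each $\left(\frac{p_i}{p}\right)$ with $p_i$ odd may be calculated as
$\left(\frac{p_i}{p}\right) = \left(\frac{p \bmod p_i}{p_i}\right)$ if $p \equiv 1 \pmod 4$ or $p_i \equiv 1 \pmod 4$, and
$\left(\frac{p_i}{p}\right) = -\left(\frac{p \bmod p_i}{p_i}\right)$ if $p \equiv p_i \equiv 3 \pmod 4$, while a factor $p_i = 2$ is handled by
Theorem 10.5.3. So instead of calculating modulo p, we can calculate modulo the
smaller prime $p_i$.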



We are now going to define the Jacobi symbol, named after the German mathematician
Carl Jacobi (1804-1851), who introduced it as a generalization
of the Legendre symbol. The Jacobi symbol plays an important role in primality testing,
to be discussed later.


Definition 10.5.3 Let n be an odd positive integer with prime factorization
$n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ and let a be an integer such that gcd(a, n) = 1. Then, the Jacobi
symbol $\left(\frac{a}{n}\right)$ is defined by

\[ \left(\frac{a}{n}\right) = \prod_{i=1}^{k} \left(\frac{a}{p_i}\right)^{\alpha_i}, \]

where the symbols on the right hand side of the equality are the Legendre symbols.

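Proposition 10.5.8 Let n be an odd positive integer and a be an integer such that
gcd(a, n) = 1. If the congruence $x^2 \equiv a \pmod n$ has a solution in Z, then $\left(\frac{a}{n}\right) = 1$.

Proof Let p be any prime factor of n. As the congruence $x^2 \equiv a \pmod n$ has a
solution in Z, the congruence $x^2 \equiv a \pmod p$ also has a solution in Z. Thus $\left(\frac{a}{p}\right) = 1$.
Consequently,

\[ \left(\frac{a}{n}\right) = \prod_{i=1}^{k} \left(\frac{a}{p_i}\right)^{\alpha_i} = 1, \quad \text{where } n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}. \qquad \Box \]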

Remark The converse of the above proposition may not be true. For example,
$\left(\frac{2}{15}\right) = \left(\frac{2}{3}\right) \cdot \left(\frac{2}{5}\right) = (-1) \cdot (-1) = 1$. However, there are no solutions to the
congruence $x^2 \equiv 2 \pmod{15}$ in Z, as the congruences $x^2 \equiv 2 \pmod 3$ and $x^2 \equiv 2 \pmod 5$
have no solutions in Z.
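
The counterexample in the Remark is small enough to check by brute force. This is a minimal sketch; the Legendre symbols are computed by Euler's criterion, which is an assumption made for the check rather than the method of the text.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, via Euler's criterion."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


# Jacobi symbol (2/15) as the product of the Legendre symbols (2/3) and (2/5).
jacobi_2_15 = legendre(2, 3) * legendre(2, 5)

# Exhaustive search for solutions of x^2 = 2 (mod 15).
roots = [x for x in range(15) if (x * x - 2) % 15 == 0]

print(jacobi_2_15, roots)
```

The symbol equals 1 although the list of roots is empty, so a Jacobi symbol equal to 1 does not certify a quadratic residue.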

Now we shall show that the Jacobi symbol enjoys some of the properties similar 
to those of Legendre symbol. 

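Proposition 10.5.9 Let n be an odd positive integer and let a and b be integers
such that gcd(a, n) = 1 and gcd(b, n) = 1. Then

(i) if $a \equiv b \pmod n$ then $\left(\frac{a}{n}\right) = \left(\frac{b}{n}\right)$;

(ii) $\left(\frac{ab}{n}\right) = \left(\frac{a}{n}\right) \cdot \left(\frac{b}{n}\right)$;

(iii) $\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2}$;

(iv) $\left(\frac{2}{n}\right) = (-1)^{(n^2-1)/8}$.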

Proof Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ be the prime factorization of n. Now we prove the
four parts one by one.

Proof of (i). Let p be any prime factor of n. As $a \equiv b \pmod n$, $a \equiv b \pmod p$.
Thus by Proposition 10.5.5(ii), it follows that $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$. Now

\[ \left(\frac{a}{n}\right) = \prod_{i=1}^{k} \left(\frac{a}{p_i}\right)^{\alpha_i} = \prod_{i=1}^{k} \left(\frac{b}{p_i}\right)^{\alpha_i} = \left(\frac{b}{n}\right). \]

Proof of (ii). The result follows directly from Proposition 10.5.5(i).

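Proof of (iii). Proposition 10.5.5(iii) asserts that for a prime factor p of n,
$\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$. Consequently,

\[ \left(\frac{-1}{n}\right) = \prod_{i=1}^{k} \left(\frac{-1}{p_i}\right)^{\alpha_i}
 = (-1)^{\alpha_1 (p_1 - 1)/2 + \alpha_2 (p_2 - 1)/2 + \cdots + \alpha_k (p_k - 1)/2}. \tag{10.5} \]

Now from the prime factorization of n, we have

\[ n = (1 + (p_1 - 1))^{\alpha_1} \cdot (1 + (p_2 - 1))^{\alpha_2} \cdots (1 + (p_k - 1))^{\alpha_k}. \]

As each $p_i$ is odd, we have

\[ (1 + (p_i - 1))^{\alpha_i} \equiv 1 + \alpha_i (p_i - 1) \pmod 4 \]

and

\[ (1 + \alpha_i (p_i - 1)) \cdot (1 + \alpha_j (p_j - 1)) \equiv 1 + \alpha_i (p_i - 1) + \alpha_j (p_j - 1) \pmod 4, \quad \text{for } i \neq j. \]

Therefore,

\[ n \equiv 1 + \alpha_1 (p_1 - 1) + \alpha_2 (p_2 - 1) + \cdots + \alpha_k (p_k - 1) \pmod 4. \]

Consequently,

\[ \frac{n-1}{2} \equiv \alpha_1 (p_1 - 1)/2 + \alpha_2 (p_2 - 1)/2 + \cdots + \alpha_k (p_k - 1)/2 \pmod 2. \tag{10.6} \]

Thus combining (10.5) and (10.6), we get $\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2}$, as required.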

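Proof of (iv). From Theorem 10.5.3, we have $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$, for any odd
prime p. Hence,

\[ \left(\frac{2}{n}\right) = \prod_{i=1}^{k} \left(\frac{2}{p_i}\right)^{\alpha_i}
 = (-1)^{\alpha_1 (p_1^2 - 1)/8 + \alpha_2 (p_2^2 - 1)/8 + \cdots + \alpha_k (p_k^2 - 1)/8}. \tag{10.7} \]

Now from the prime factorization of n, we have

\[ n^2 = (1 + (p_1^2 - 1))^{\alpha_1} \cdot (1 + (p_2^2 - 1))^{\alpha_2} \cdots (1 + (p_k^2 - 1))^{\alpha_k}. \]

As $p_i^2 - 1 \equiv 0 \pmod 8$, we have

\[ (1 + (p_i^2 - 1))^{\alpha_i} \equiv 1 + \alpha_i (p_i^2 - 1) \pmod{64} \]

and

\[ (1 + \alpha_i (p_i^2 - 1)) \cdot (1 + \alpha_j (p_j^2 - 1)) \equiv 1 + \alpha_i (p_i^2 - 1) + \alpha_j (p_j^2 - 1) \pmod{64}, \quad \text{for } i \neq j. \]

Therefore,

\[ n^2 \equiv 1 + \alpha_1 (p_1^2 - 1) + \alpha_2 (p_2^2 - 1) + \cdots + \alpha_k (p_k^2 - 1) \pmod{64}. \]

Consequently,

\[ \frac{n^2 - 1}{8} \equiv \alpha_1 (p_1^2 - 1)/8 + \alpha_2 (p_2^2 - 1)/8 + \cdots + \alpha_k (p_k^2 - 1)/8 \pmod 8. \tag{10.8} \]

Thus combining (10.7) and (10.8), we have $\left(\frac{2}{n}\right) = (-1)^{(n^2-1)/8}$, as required. □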

Theorem 10.5.5 Let n and m be two odd positive integers such that gcd(n, m) = 1.
Then

\[ \left(\frac{n}{m}\right)\left(\frac{m}{n}\right) = (-1)^{\frac{n-1}{2} \cdot \frac{m-1}{2}}. \]



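Proof Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ and $m = q_1^{\beta_1} q_2^{\beta_2} \cdots q_t^{\beta_t}$ be the prime factorizations of
n and m, respectively. Now

\[ \left(\frac{n}{m}\right) = \prod_{j=1}^{t} \left(\frac{n}{q_j}\right)^{\beta_j} = \prod_{j=1}^{t} \prod_{i=1}^{k} \left(\frac{p_i}{q_j}\right)^{\alpha_i \beta_j} \]

and

\[ \left(\frac{m}{n}\right) = \prod_{i=1}^{k} \left(\frac{m}{p_i}\right)^{\alpha_i} = \prod_{i=1}^{k} \prod_{j=1}^{t} \left(\frac{q_j}{p_i}\right)^{\alpha_i \beta_j}. \]

Hence,

\[ \left(\frac{n}{m}\right)\left(\frac{m}{n}\right) = \prod_{i=1}^{k} \prod_{j=1}^{t} \left( \left(\frac{p_i}{q_j}\right)\left(\frac{q_j}{p_i}\right) \right)^{\alpha_i \beta_j}. \]

Using the law of quadratic reciprocity for the primes $p_i$ and $q_j$, we have

\[ \left(\frac{n}{m}\right)\left(\frac{m}{n}\right)
 = \prod_{i=1}^{k} \prod_{j=1}^{t} (-1)^{\alpha_i \left(\frac{p_i - 1}{2}\right) \beta_j \left(\frac{q_j - 1}{2}\right)}
 = (-1)^{\left( \sum_{i=1}^{k} \alpha_i \frac{p_i - 1}{2} \right) \left( \sum_{j=1}^{t} \beta_j \frac{q_j - 1}{2} \right)}. \tag{10.9} \]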
By an argument similar to that in the Proof of (iii) in Proposition 10.5.9, we have

\[ \sum_{i=1}^{k} \alpha_i \left(\frac{p_i - 1}{2}\right) \equiv \frac{n-1}{2} \pmod 2 \]

and

\[ \sum_{j=1}^{t} \beta_j \left(\frac{q_j - 1}{2}\right) \equiv \frac{m-1}{2} \pmod 2. \]

Thus from (10.9), we have

\[ \left(\frac{n}{m}\right)\left(\frac{m}{n}\right) = (-1)^{\frac{n-1}{2} \cdot \frac{m-1}{2}}, \]

as required.

□
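
Proposition 10.5.9 and Theorem 10.5.5 together allow the Jacobi symbol to be evaluated without factoring n. The sketch below shows one common way to organize that computation; it is an illustration, not the method of the text, and returning 0 when gcd(a, n) > 1 is a convention assumed here (the definition above only covers the coprime case).

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n; returns 0 if gcd(a, n) > 1 (assumed convention)."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    a %= n                          # part (i): the symbol depends only on a mod n
    result = 1
    while a != 0:
        while a % 2 == 0:           # parts (ii) and (iv): pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):     # (2/n) = -1 exactly when n = +-3 (mod 8)
                result = -result
        a, n = n, a                 # Theorem 10.5.5: swap the two odd arguments
        if a % 4 == 3 and n % 4 == 3:
            result = -result        # sign flips only when both are 3 (mod 4)
        a %= n
    return result if n == 1 else 0


print(jacobi(2, 15), jacobi(1001, 9907))
```

Each pass through the loop reduces the arguments much as the Euclidean algorithm does, so the number of passes grows only with the bit length of the inputs.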


10.6 Fermat Numbers 

In this section, we shall look at a special class of numbers, known as Fermat numbers.
In number theory, this special class of numbers plays an important role.

Definition 10.6.1 A Fermat number is an integer $F_n$ of the form $F_n = 2^{2^n} + 1$,
$n \geq 0$. If $F_n$ is prime, then it is called a Fermat prime.

Fermat, having great mathematical intuition, observed that $F_0 = 3$, $F_1 = 5$,
$F_2 = 17$, $F_3 = 257$, $F_4 = 65537$ are all primes. He believed that $F_n$ is prime
for each value of n and expressed his confidence while communicating his idea to
another great mathematician, Mersenne. In 1732, Euler proved this belief
wrong by showing that 641 divides $F_5 = 4294967297$. G. Bennett gave the
following alternative elementary proof that 641 divides $F_5$.

Proposition 10.6.1 641 divides $F_5$.

Proof Observe that $641 = 2^7 \cdot 5 + 1$. Let $a = 2^7$ and $b = 5$. Now
$1 + ab - b^4 = 1 + b(a - b^3) = 1 + b(2^7 - 5^3) = 1 + 3b = 2^4$. But this implies

\[ \begin{aligned}
F_5 = 2^{2^5} + 1 &= 2^4 (2^7)^4 + 1 = (1 + ab - b^4) a^4 + 1 \\
 &= (1 + ab) a^4 - a^4 b^4 + 1 = (1 + ab) a^4 + (1 + a^2 b^2)(1 - a^2 b^2) \\
 &= (1 + ab) a^4 + (1 + ab)(1 - ab)(1 + a^2 b^2) \\
 &= (1 + ab)\left(a^4 + (1 - ab)(1 + a^2 b^2)\right).
\end{aligned} \]

This implies that $641 = ab + 1$ divides $F_5$. □
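
Bennett's identity can be checked with exact integer arithmetic. The following sketch simply evaluates both sides; the variable name `cofactor` is illustrative.

```python
a, b = 2**7, 5
F5 = 2**(2**5) + 1

# The two observations used in the proof.
assert 1 + a * b == 641
assert 1 + a * b - b**4 == 2**4

# Second factor in F5 = (1 + ab) * (a^4 + (1 - ab)(1 + a^2 b^2)).
cofactor = a**4 + (1 - a * b) * (1 + a**2 * b**2)

print(F5 == 641 * cofactor, cofactor)  # True 6700417
```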

The following proposition gives a nice form of the prime factors of $F_n$.

Proposition 10.6.2 Every prime factor of $F_n$ ($n \geq 2$) must be of the form $2^{n+2} k + 1$,
for some positive integer k.

Proof Let p be a prime factor of $F_n = 2^{2^n} + 1$, where $n \geq 2$. Then
$2^{2^n} \equiv -1 \pmod p$. This implies that $2^{2^n} \cdot 2^{2^n} \equiv 1 \pmod p$, with the result
$2^{2^{n+1}} \equiv 1 \pmod p$. We claim that the order of 2 (mod p) in $\mathbb{Z}_p^*$ is $2^{n+1}$. If possible, let
the order of 2 (mod p) in $\mathbb{Z}_p^*$ be $t < 2^{n+1}$. Then t divides $2^{n+1}$. Thus t must be
of the form $2^k$ for some $k \leq n$, so t divides $2^n$. Thus $2^{2^n} \equiv 1 \pmod p$, contradicting the fact that
$2^{2^n} \equiv -1 \pmod p$. Hence the order of 2 (mod p) in $\mathbb{Z}_p^*$ is $2^{n+1}$. Thus by Lagrange's
Theorem, $2^{n+1}$ divides $p - 1$. In particular, as $n \geq 2$, $8 = 2^3$ divides $p - 1$. So
$p \equiv 1 \pmod 8$. Hence by Theorem 10.5.3 and Proposition 10.5.4,
$\left(\frac{2}{p}\right) = 1 \equiv 2^{(p-1)/2} \pmod p$. As the order of 2 is $2^{n+1}$, it follows that $2^{n+1}$ divides
$\frac{p-1}{2}$, which implies $p = k 2^{n+2} + 1$, for some positive integer k. □

Remark Proposition 10.6.2 may be used to provide a simple alternative proof of the
fact that $641 = 2^{5+2} \cdot 5 + 1$ divides $F_5$ (Proposition 10.6.1).
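
Proposition 10.6.2 narrows the search for a factor of $F_n$ to the candidates $k \cdot 2^{n+2} + 1$. The sketch below tries these candidates in increasing order; the function name and the `max_k` bound are hypothetical choices made for illustration.

```python
def smallest_fermat_factor(n: int, max_k: int):
    """Search d = k * 2^(n+2) + 1, k = 1..max_k, for a proper divisor of F_n (n >= 2).

    Returns the first divisor found, or None if none is found within the bound
    or once d exceeds the square root of F_n.
    """
    F = 2**(2**n) + 1
    step = 2**(n + 2)
    for k in range(1, max_k + 1):
        d = k * step + 1
        if d * d > F:       # no proper divisor can remain beyond sqrt(F_n)
            return None
        if F % d == 0:
            return d        # the smallest divisor > 1 is necessarily prime
    return None


print(smallest_fermat_factor(5, 100))    # 641
print(smallest_fermat_factor(6, 2000))   # 274177
```

For $F_5$ only five candidates (129, 257, 385, 513, 641) are tried before the factor 641 is found.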

Since the number $F_n$ increases very rapidly with n, it is not easy to check
the primality of $F_n$ (i.e., to check whether $F_n$ is prime or not). But in 1877, Pepin
introduced the following primality test for $F_n$.

Proposition 10.6.3 Let $F_n$ denote the nth Fermat number with $n \geq 2$. For an
integer $k \geq 2$, the following conditions are equivalent.

1. $F_n$ is prime and the Legendre symbol $\left(\frac{k}{F_n}\right) = -1$.

2. $k^{(F_n - 1)/2} \equiv -1 \pmod{F_n}$.

Proof (1) ⇒ (2) Let $F_n$ be prime and $\left(\frac{k}{F_n}\right) = -1$. Then by the definition of the Legendre
symbol, $k^{(F_n - 1)/2} \equiv -1 \pmod{F_n}$.

(2) ⇒ (1) Let $1 \leq a < F_n$ be such that $a \equiv k \pmod{F_n}$. Then
$a^{(F_n - 1)/2} \equiv -1 \pmod{F_n}$. Thus $a^{F_n - 1} \equiv 1 \pmod{F_n}$. As $a^{(F_n - 1)/2} \equiv -1 \pmod{F_n}$ and 2 is the
only prime factor of $F_n - 1$, by Proposition 10.9.3, $F_n$ is a prime integer. Finally,
the rest of the proposition follows from Proposition 10.5.4. □
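
Condition 2 of Proposition 10.6.3 is a single modular exponentiation, so it is easy to try out. The sketch below uses k = 3, a standard choice that is an assumption here and is not specified in the text: for $n \geq 1$ we have $F_n \equiv 1 \pmod 4$ and $F_n \equiv 2 \pmod 3$, so quadratic reciprocity gives $\left(\frac{3}{F_n}\right) = \left(\frac{2}{3}\right) = -1$ whenever $F_n$ is prime.

```python
def pepin(n: int, k: int = 3) -> bool:
    """Return True iff k^((F_n - 1)/2) = -1 (mod F_n), i.e. condition 2 holds."""
    F = 2**(2**n) + 1
    return pow(k, (F - 1) // 2, F) == F - 1


for n in range(2, 8):
    print(n, pepin(n))
```

With k = 3 the test reports True for n = 2, 3, 4 and False for n = 5, 6, 7, in line with the known status of these Fermat numbers.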

Remark To date it is not known whether there exist infinitely many Fermat primes;
research in that direction is ongoing. Fermat numbers have been studied
intensively, often with the aid of computers. Fermat primes play an important role
in geometry: as shown by Gauss in 1801, a regular polygon with k sides can be
constructed by ruler and compass methods if and only if $k = 2^e p_1 p_2 \cdots p_r$, where
$p_1, p_2, \ldots, p_r$ are distinct Fermat primes.

We now study some more properties of Fermat numbers. 

Proposition 10.6.4 Let $F_n$ denote the nth Fermat number. Then for all positive
integers n, $F_0 F_1 F_2 \cdots F_{n-1} = F_n - 2$.

Proof We prove the result by the principle of mathematical induction on n. For
n = 1, $F_0 = 3$ and $F_1 = 5$. So the condition $F_0 = F_1 - 2$ holds for n = 1.
Let us assume that the condition holds for some $k \geq 1$, i.e., $F_0 F_1 F_2 \cdots F_{k-1} = F_k - 2$.
Now, $F_0 F_1 F_2 \cdots F_{k-1} F_k = (F_0 F_1 F_2 \cdots F_{k-1}) F_k = (F_k - 2) F_k
= (2^{2^k} + 1 - 2)(2^{2^k} + 1) = 2^{2^{k+1}} - 1 = F_{k+1} - 2$, showing that the result is true for k + 1.
Hence the result follows from the principle of mathematical induction. □

We are going to show another interesting property of Fermat numbers which 
leads to an alternative proof of Theorem 10.2.2. 

Proposition 10.6.5 Distinct Fermat numbers are co-prime.

Proof Let d be the gcd of $F_n$ and $F_{n+m}$, where $m \in \mathbb{N}^+$, i.e., $d = \gcd(F_n, F_{n+m})$.
Note that x = -1 is a root of the polynomial $x^{2^m} - 1$. Thus x + 1 divides $x^{2^m} - 1$.
Now substituting $x = 2^{2^n}$, we see that $F_n = 2^{2^n} + 1$ divides $2^{2^{n+m}} - 1 = F_{n+m} - 2$, with the
result that d divides 2. As both $F_n$ and $F_{n+m}$ are odd, d must be 1. □
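
Propositions 10.6.4 and 10.6.5 can both be confirmed for the first few Fermat numbers with exact integer arithmetic. This is a minimal sketch; the cut-off at eight Fermat numbers is an arbitrary choice.

```python
from math import gcd, prod

# F_0, ..., F_7
fermat = [2**(2**n) + 1 for n in range(8)]

# Proposition 10.6.4: F_0 F_1 ... F_{n-1} = F_n - 2 for n = 1, ..., 7.
product_ok = all(prod(fermat[:n]) == fermat[n] - 2 for n in range(1, 8))

# Proposition 10.6.5: distinct Fermat numbers are coprime.
coprime_ok = all(
    gcd(fermat[i], fermat[j]) == 1
    for i in range(8)
    for j in range(i + 1, 8)
)

print(product_ok, coprime_ok)
```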

Alternative Proof of Theorem 10.2.2 From the above proposition, it follows that for
any given positive integer n, each of the Fermat numbers $F_i$ is divisible by an odd
prime which does not divide any of the other Fermat numbers $F_j$, for $i \neq j$,
$i, j \in \{1, 2, \ldots, n\}$. So for any given positive integer n, there are at least n distinct odd
primes not exceeding $F_n$, proving that the number of primes is infinite. □

Remark Note that the (n + 1)th prime satisfies $p_{n+1} \leq F_n = 2^{2^n} + 1$. This inequality is slightly
stronger than the inequality mentioned in Corollary 1 of Theorem 10.2.2.


10.7 Perfect Numbers and Mersenne Numbers 

Due to some mystical beliefs, the ancient Greeks were interested in a special type
of integers, known as perfect numbers, that are equal to the sum of all their proper
positive divisors. Surprisingly, the ancient Greeks knew how to find even perfect
numbers.

Definition 10.7.1 A positive integer n is said to be a perfect number if $\sigma(n) = 2n$,
i.e., the sum of all the proper positive divisors of n is equal to n itself.

Example 10.7.1 Since $\sigma(6) = 1 + 2 + 3 + 6 = 12$ and $\sigma(28) = 1 + 2 + 4 + 7 +
14 + 28 = 56$, 6 and 28 are examples of perfect numbers. The next perfect number
is 496.

Next we shall characterize the even perfect numbers through the following theorem.

Theorem 10.7.1 An even integer n > 0 is a perfect number if and only if
$n = 2^{k-1}(2^k - 1)$, where $k \geq 2$ and $2^k - 1$ is a prime integer.

Proof First let n be an even perfect number. Let $n = 2^t m$, where $t, m \geq 1$ and m
is odd. As $\sigma$ is a multiplicative function, $\sigma(n) = \sigma(2^t m) = \sigma(2^t)\sigma(m) = (2^{t+1} - 1)\sigma(m)$.
Again as n is a perfect number, $\sigma(n) = 2n = 2^{t+1} m$. Thus we have

\[ 2^{t+1} m = (2^{t+1} - 1)\sigma(m). \tag{10.10} \]

Since $2^{t+1}$ and $2^{t+1} - 1$ are relatively prime, $2^{t+1}$ divides $\sigma(m)$. Thus there exists
some integer q such that

\[ \sigma(m) = 2^{t+1} q. \tag{10.11} \]

Consequently, from (10.10), we have

\[ m = (2^{t+1} - 1) q. \tag{10.12} \]

From (10.12), q divides m. We claim that $q \neq m$. If q = m, then from (10.12), we
have $1 = 2^{t+1} - 1$, implying t = 0, a contradiction as $t \geq 1$, with the result that
$q \neq m$. Now from (10.11) and (10.12), we see that

\[ m + q = (2^{t+1} - 1) q + q = 2^{t+1} q = \sigma(m). \tag{10.13} \]

Now we claim that q = 1. If possible, let $q \neq 1$. As $q \neq m$, there are at least three
positive divisors of m, namely 1, q, and m, implying $\sigma(m) \geq 1 + q + m > q + m =
\sigma(m)$, by (10.13), a contradiction. Hence q = 1. Thus from (10.12) we have

\[ m = 2^{t+1} - 1. \tag{10.14} \]

Also from (10.13), we have $\sigma(m) = m + 1$, which implies m must be a prime integer.
Hence from (10.14), we can conclude that $n = 2^t m = 2^t (2^{t+1} - 1)$, where $t \geq 1$ and
$2^{t+1} - 1$ is a prime integer.

Conversely, we assume that $n = 2^{k-1}(2^k - 1)$, where $k \geq 2$ and $2^k - 1$ is a prime
integer. We have to show that $\sigma(n) = 2n$. Since $2^k - 1$ is odd, $2^{k-1}$ and $2^k - 1$ are
relatively prime. Thus $\sigma(n) = \sigma(2^{k-1})\sigma(2^k - 1) = (1 + 2 + 2^2 + \cdots + 2^{k-1})(1 +
2^k - 1) = (2^k - 1) 2^k$, as $2^k - 1$ is a prime integer. This implies that $\sigma(n) = 2(2^{k-1}(2^k - 1)) = 2n$,
proving that n is a perfect number. □
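
Theorem 10.7.1 can be illustrated by generating even perfect numbers from the formula and comparing with a brute-force search. The sketch below uses naive divisor sums and trial division, which is adequate only for small inputs; the helper names are illustrative.

```python
def sigma(n: int) -> int:
    """Sum of all positive divisors of n, found by pairing d with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def is_prime(n: int) -> bool:
    """Trial-division primality check, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Numbers of the form 2^(k-1) (2^k - 1) with 2^k - 1 prime, for k = 2, ..., 13.
from_formula = [2**(k - 1) * (2**k - 1) for k in range(2, 14) if is_prime(2**k - 1)]

# Brute force: even n below 10000 with sigma(n) = 2n.
by_search = [n for n in range(2, 10000, 2) if sigma(n) == 2 * n]

print(from_formula)
print(by_search)
```

The search below 10000 returns exactly the formula values in that range, namely 6, 28, 496 and 8128.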

Theorem 10.7.1 asserts that to get an even perfect number we need a
prime number of the form $2^k - 1$, where $k \geq 2$. The following theorem
helps in finding such primes.

Theorem 10.7.2 Let a and n > 1 be positive integers. If $a^n - 1$ is a prime integer,
then a = 2 and n is a prime integer.

Proof First we prove that a = 2. As $a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + 1)$ is
prime and a is a positive integer greater than 1, the second factor is greater than 1,
so $a - 1$ must be equal to 1, implying
a = 2. Now we claim that n is a prime integer. If not, then let n = rs, 1 < r, s < n.
Thus $a^n - 1 = a^{rs} - 1 = (a^r - 1)((a^r)^{s-1} + (a^r)^{s-2} + \cdots + 1)$, a contradiction,
as both $(a^r - 1)$ and $((a^r)^{s-1} + (a^r)^{s-2} + \cdots + 1)$ are positive integers greater than
1. Hence n is a prime integer. □

Theorem 10.7.2 asserts that in the search for primes of the form $2^n - 1$, we need to consider
only integers n that are prime. Motivated by the study of perfect numbers, integers
of this special type were studied in great depth by Mersenne, a great French monk of
the 17th century. They are known as Mersenne
numbers and are defined as follows.

Definition 10.7.2 If n is a positive integer, then $M_n = 2^n - 1$ is called the nth
Mersenne number. Further, if $M_p = 2^p - 1$ is a prime integer, then $M_p$ is called a
Mersenne prime.

Example 10.7.2 Already at the time of Mersenne, it was known that some
Mersenne numbers, such as $M_2 = 3$, $M_3 = 7$, $M_5 = 31$, $M_7 = 127$, are primes,
while $M_{11} = 2047 = 23 \times 89$ and $M_{37} = 2^{37} - 1 = 137438953471 = 223 \times
616318177$ are composite. In 1640, Mersenne stated that $M_p$ is also a prime for
p = 13, 17, 19, 31, 67, 127, 257. Unfortunately, he was wrong about 67 and 257,
and he did not include 61, 89, 107 (among those less than 257), which
also produce Mersenne primes. However, we must appreciate that his statement was
quite astonishing, in view of the size of the numbers involved.

Since $M_n$ increases exponentially, it is very difficult to check the primality of $M_n$
even for moderate n. But the following theorem plays an important role in checking
the primality of $M_n$. To prove it, let us first prove the following lemma.

Lemma 10.7.1 For any two positive integers a and b, $\gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a,b)} - 1$.

Proof Let c = gcd(a, b). Then $2^c - 1$ divides both $2^a - 1$ and $2^b - 1$. Let d divide
both $2^a - 1$ and $2^b - 1$. We show that d also divides $2^c - 1$. If possible, suppose
d does not divide $2^c - 1$. Then there exist a prime p and a positive integer n such
that $p^n$ divides d but $p^n$ does not divide $2^c - 1$. Now $p^n$ divides d implies $p^n$
divides both $2^a - 1$ and $2^b - 1$. Note that as both $2^a - 1$ and $2^b - 1$ are odd, p
must be an odd prime integer. Also as c = gcd(a, b), there exist integers x and y
such that c = ax + by. Further, $2^a \equiv 1 \pmod{p^n}$ and $2^b \equiv 1 \pmod{p^n}$ imply that
$2^c = 2^{ax+by} \equiv 1 \pmod{p^n}$, with the result that $p^n$ divides $2^c - 1$, a contradiction.
Thus d divides $2^c - 1$. Hence $\gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a,b)} - 1$. This completes
the proof. □
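
Lemma 10.7.1 is easy to confirm for small exponents. The following sketch checks the identity over a finite range, which is a sanity check only; the range bound of 40 is an arbitrary choice.

```python
from math import gcd

# Compare gcd(2^a - 1, 2^b - 1) with 2^gcd(a, b) - 1 for 1 <= a, b < 40.
lemma_holds = all(
    gcd(2**a - 1, 2**b - 1) == 2**gcd(a, b) - 1
    for a in range(1, 40)
    for b in range(1, 40)
)

print(lemma_holds)
print(gcd(2**12 - 1, 2**18 - 1), 2**6 - 1)  # both equal 63
```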

Using the above lemma, we prove the following theorem. 
Theorem 10.7.3 If p is an odd prime, then every divisor of the pth Mersenne
number $M_p = 2^p - 1$ is of the form 2kp + 1, where k is a positive integer.

Proof To show the result, it is sufficient to prove that any prime q dividing $M_p$ is of
the form 2kp + 1, for some positive integer k. By Fermat's Little Theorem, q divides
$2^{q-1} - 1$. Now by Lemma 10.7.1, we have $\gcd(2^p - 1, 2^{q-1} - 1) = 2^{\gcd(p, q-1)} - 1$.
As q is a common divisor of $2^p - 1$ and $2^{q-1} - 1$, $\gcd(2^p - 1, 2^{q-1} - 1) > 1$. We
claim that $\gcd(p, q - 1) \neq 1$. If possible, let gcd(p, q - 1) = 1. Then
$\gcd(2^p - 1, 2^{q-1} - 1) = 1$, a contradiction. Therefore, as p is prime, gcd(p, q - 1) = p. Hence p divides
q - 1. Thus there exists a positive integer t such that q - 1 = tp. As p is odd
and q - 1 is even, t must be even, say t = 2k, for some positive integer k. Hence
q = 2kp + 1, as required. □
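
Theorem 10.7.3 restricts the possible factors of $M_p$ to the numbers 2kp + 1, which suggests a simple search. The sketch below is illustrative only: the function name and the `max_k` bound are hypothetical, and the test $2^p \equiv 1 \pmod q$ is used in place of forming $M_p \bmod q$ directly.

```python
def mersenne_factor(p: int, max_k: int):
    """Search q = 2kp + 1, k = 1..max_k, for a proper divisor of M_p = 2^p - 1 (p an odd prime).

    Returns the first divisor found, or None if none is found within the bound
    or once q exceeds the square root of M_p.
    """
    M = 2**p - 1
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        if q * q > M:               # no proper divisor can remain beyond sqrt(M_p)
            return None
        if pow(2, p, q) == 1:       # q divides 2^p - 1
            return q
    return None


print(mersenne_factor(11, 10))   # 23
print(mersenne_factor(37, 10))   # 223
print(mersenne_factor(13, 10))   # None, consistent with M_13 = 8191 being prime
```

For $M_{37}$ the factor 223 = 2 * 3 * 37 + 1 appears at k = 3, so only three candidates need to be tested.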


10.8 Analysis of the Complexity of Algorithms 

In this section, we study how to analyze an algorithm and how the computational
complexity of an algorithm is computed. This plays an important role in
computational number theory, as the analysis enables us to judge whether an
algorithm is efficient or not. Computational complexity theory is an important
branch of theoretical computer science and mathematics. Complexity theory, or
more precisely, computational complexity theory, deals with the resources required 
during some computation to solve a given problem. The main objective of the com- 
putational complexity theory is to classify the computational problems according 
to their inherent difficulties. Again in computational complexity theory, complexity 
analysis of an algorithm plays an important role. Complexity analysis is a tool that 
allows us to explain how an algorithm behaves as the input grows larger i.e., if we 
feed an algorithm a different input, the question is how will the algorithm behave? 
More specifically, if one algorithm takes 1 second to run for an input of size 1000, 
how will it behave if we double the input size? Will it run just as fast, half as fast, 
or four times slower? In practical programming, this is important as it allows us to 
predict how our algorithm will behave when the input data becomes larger. The pro- 
cess of computing involves the consumption of different resources like time taken to 
perform the computation, amount of memory used, power consumed by the system 
performing the computation, etc. To measure the amount of resources utilized by 
different algorithms, we need the following notion of asymptotic notations. 


10.8.1 Asymptotic Notation 

Here we discuss some standard notations related to the rate of growth of functions.
This concept will be useful not only in discussing the running time of algorithms but
also in a number of other contexts. In algorithm analysis, it is convenient to have a
notation for running time that is not clogged up by too much detail, because often
it is not possible, and often it would even be undesirable, to have an exact formula
for the number of operations. For this, “O-notation” (read as “big-Oh” notation)
is commonly used. We give the definitions and basic rules for manipulating bounds
for any given algorithm.

Let f and g be real-valued functions, both defined either on the set of non-negative
integers (N), or on the set of non-negative reals R*. Here we are interested
in the behavior of both f(x) and g(x) for large x, mostly for $x \to \infty$. For this
reason, we only require that f(x) and g(x) are defined for all sufficiently large x.
We further assume that g(x) > 0 for all sufficiently large x.

Definition 10.8.1 For the function $g : \mathbb{N} \to \mathbb{R}^*$, let us define

• $O(g(n)) = \{f(n) \mid f : \mathbb{N} \to \mathbb{R}^*$ and $\exists$ a positive real constant c and $n_0 \in \mathbb{N}$ such that
$0 \leq f(n) \leq c\,g(n)$, $\forall n \geq n_0\}$.

Remark Instead of writing $f(n) \in O(g(n))$, we write $f(n) = O(g(n))$. This
means that g(n) is an asymptotic upper bound for f(n). For example, if $f(n) =
n^2 + 7n + 10000$, then $\forall n \geq 10000$, $n^2 + 7n + 10000 \leq n^2 + n^2 + n^2 = 3n^2$.
Thus in this case, $n_0 = 10000$ and c = 3. So, $n^2 + 7n + 10000 = O(n^2)$. Thus
O-notation classifies functions by simple representative growth functions even
if the exact values of the functions are unknown. Note that since we demand a
certain behavior of a function only for $n \geq n_0$, it does not matter if the functions
that we consider are not defined or not specified for $n < n_0$.

• $\Theta(g(n)) = \{f(n) \mid f : \mathbb{N} \to \mathbb{R}^*$ and $\exists$ positive real constants $c_1, c_2$ and $n_0 \in \mathbb{N}$ such that
$0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n)$, $\forall n \geq n_0\}$.

Remark Instead of writing $f(n) \in \Theta(g(n))$, we write $f(n) = \Theta(g(n))$. Informally,
$\Theta$-notation stands for equality in the asymptotic sense. For example, if
$f(n) = n^2/2 - 2n$, then $\forall n \geq 8$, $n^2/4 \leq n^2/2 - 2n \leq n^2/2$. So here, $c_1 = 1/4$,
$c_2 = 1/2$ and $n_0 = 8$. Thus $f(n) = \Theta(n^2)$. Also note that $6n^3 \neq \Theta(n^2)$: if
$6n^3 = \Theta(n^2)$, there exist $c_2 \in \mathbb{R}^+$ and $n_0 \in \mathbb{N}$ such that $\forall n \geq n_0$, $6n^3 \leq c_2 n^2$,
which implies $n \leq c_2/6$, a contradiction, proving that $6n^3 \neq \Theta(n^2)$. The notation
$\Theta(1)$ is used either for a constant or for a constant function with respect to
some variable. Finally note that $\Theta(g(n)) \subset O(g(n))$, as $f(n) = \Theta(g(n))$ implies
$f(n) \in O(g(n))$.

• $\Omega(g(n)) = \{f(n) \mid f : \mathbb{N} \to \mathbb{R}^*$ and $\exists$ a positive real constant c and $n_0 \in \mathbb{N}$ such that
$0 \leq c\,g(n) \leq f(n)$, $\forall n \geq n_0\}$.

Remark Instead of writing $f(n) \in \Omega(g(n))$, we write $f(n) = \Omega(g(n))$. This
means that g(n) is an asymptotic lower bound for f(n). For example, if $f(n) =
n^2 + 7n + 10000$, then $\forall n \geq 1$, $n^2 \leq n^2 + 7n + 10000$. Thus in this case, $n_0 = 1$
and c = 1. So, $n^2 + 7n + 10000 = \Omega(n^2)$. Note that for any two functions f and
g, $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.

[Fig. 10.2 Asymptotic bounds. Panels: g(n) is an asymptotic upper bound for f(n); g(n) is an asymptotic lower bound for f(n); g(n) is an asymptotically tight bound for f(n)]


• $o(g(n)) = \{f(n) \mid f : \mathbb{N} \to \mathbb{R}^*$ and for every positive real constant c there exists $n_0 \in \mathbb{N}$ such that
$0 \leq f(n) < c\,g(n)$, $\forall n \geq n_0\}$.

Remark Instead of writing $f(n) \in o(g(n))$, we write $f(n) = o(g(n))$. Here,
the bound is not an asymptotically tight upper bound for f(n). For example,
$2n = o(n^2)$ but $2n \neq o(n)$. The main difference between O and o is that in
$f(n) = O(g(n))$, the bound $0 \leq f(n) \leq c\,g(n)$ holds for some constant $c \in \mathbb{R}^+$,
but in $f(n) = o(g(n))$, the bound $0 \leq f(n) < c\,g(n)$ holds for all constants $c \in \mathbb{R}^+$.
Intuitively, in o-notation, the function f(n) becomes insignificant relative to g(n)
as n approaches infinity, i.e., $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$.

Note In Fig. 10.2, the above asymptotic bounds are described. 

• $\omega(g(n)) = \{f(n) \mid f : \mathbb{N} \to \mathbb{R}^*$ and for every positive real constant c there exists $n_0 \in \mathbb{N}$ such that
$0 \leq c\,g(n) < f(n)$, $\forall n \geq n_0\}$.

Remark Instead of writing $f(n) \in \omega(g(n))$, we write $f(n) = \omega(g(n))$. Here, the
bound is not an asymptotically tight lower bound for f(n). For example, $2n^2 =
\omega(n)$ but $2n^2 \neq \omega(n^2)$. An alternative way of defining the notation $\omega$ is $f(n) \in
\omega(g(n))$ if and only if $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$. Finally, note that $f(n) \in \omega(g(n))$ if
and only if $g(n) \in o(f(n))$.
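
The witnesses c and $n_0$ in the examples above can be checked on a finite range of n. The sketch below does this for the O, Θ and o examples; a finite check of course proves nothing about all n, and the variable names are illustrative.

```python
def f(n: int) -> int:
    """The example function n^2 + 7n + 10000."""
    return n * n + 7 * n + 10000


# Big-O example: f(n) <= 3 n^2 for n >= 10000 (checked on a finite sample only).
big_o_ok = all(f(n) <= 3 * n * n for n in range(10000, 20000))

# The witness n0 is not unique: the smallest n0 that works with c = 3 on this range.
smallest_n0 = next(
    n for n in range(1, 10000)
    if all(f(m) <= 3 * m * m for m in range(n, 10000))
)

# Theta example, multiplied through by 4: n^2 <= 2n^2 - 8n <= 2n^2 for n >= 8.
theta_ok = all(n * n <= 2 * n * n - 8 * n <= 2 * n * n for n in range(8, 5000))

# little-o example: the ratio 2n / n^2 shrinks toward 0 as n grows.
ratios = [2 * n / (n * n) for n in (10, 100, 1000, 10000)]

print(big_o_ok, smallest_n0, theta_ok, ratios)
```

The value found for `smallest_n0` is 73, which shows that the choice $n_0 = 10000$ in the Remark is convenient rather than optimal; any valid pair (c, $n_0$) suffices.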


10.8.2 Relational Properties Among the Asymptotic Notations

Here we shall discuss some of the relational properties among the asymptotic
notations.

Transitivity: $f(n) = \Theta(g(n))$ and $g(n) = \Theta(h(n))$ imply $f(n) = \Theta(h(n))$.
Similar results hold for $O$, $\Omega$, $o$, $\omega$.

Reflexivity: $f(n) = \Theta(f(n))$. The same result holds for $O$ and $\Omega$,
but reflexivity does not hold for $o$ and $\omega$.

Symmetry: $f(n) = \Theta(g(n))$ if and only if $g(n) = \Theta(f(n))$. The same
result does not hold for $O$, $\Omega$, $o$, $\omega$.

Transpose Symmetry: $f(n) = O(g(n))$ if and only if $g(n) = \Omega(f(n))$. Also,
$f(n) = o(g(n))$ if and only if $g(n) = \omega(f(n))$.


10.8.3 Poly-time and Exponential-Time Algorithms 

In this section, we shall introduce informally some of the important concepts of 
complexity theory. Instead of using a fully abstract model of computation, such as, 
Turing machines, in this section, we consider all algorithms running on a digital 
computer with a typical instruction set, an infinite number of bits of memory and 
constant-time memory access. This model may be thought of as the random access 
machine (or register machine) model. 

A computational problem is specified by an input (of a certain form) and an 
output (satisfying certain properties relative to the input). An instance of a compu- 
tational problem is a specific input. The input size of an instance of a computational 
problem is the number of bits required to represent the instance. The output size of 
an instance of a computational problem is the number of bits necessary to repre- 
sent the output. A decision problem is a computational problem where the output is 
either “yes” or “no”. 

An algorithm to solve a computational problem is called deterministic if it does 
not make use of any randomness. We will study the asymptotic complexity of de- 
terministic algorithms by counting the number of bit operations performed by the 
algorithm expressed as a function of the input size. Upper bounds on the complexity 
are presented using “big O” notation. When giving complexity estimates using big 
O notation we implicitly assume that there is a countably infinite number of possible 
inputs to the algorithm. 

Order notation allows us to define several fundamental concepts that are used 
to get a rough bound on the computational complexity of mathematical problems. 
Suppose that we are trying to solve a certain type of mathematical problem, where 
the input to the problem is a number whose size may vary. As an example, consider 
the Integer Factorization Problem, whose input is a number N and whose output is 
a prime factor of N. We are interested in knowing how long it takes to solve the 
problem in terms of the size of the input. Typically, one measures the size of the 
input by its number of bits, i.e., how much storage it takes to record the input. 

Definition 10.8.2 A problem is said to be solvable in polynomial time if there is
a constant c > 0, independent of the size of the input, such that for inputs of size
$O(k)$ bits, there is an algorithm to solve the problem in $O(k^c)$ steps.

Remark If we take c to be 1 in Definition 10.8.2, then the problem is solvable in 
linear time, while if we take c to be 2, then we say that the problem is solvable 
in quadratic time. In general, polynomial-time algorithms are considered to be fast 
algorithms. 

Definition 10.8.3 A problem is said to be solvable in exponential time if there is a
constant c > 0 such that for inputs of size $O(k)$ bits, there is an algorithm to solve
the problem in $O(e^{ck})$ steps.

Remark Exponential-time algorithms are considered to be slow algorithms. 

Remark In the theory of complexity, problems solvable in polynomial time are con- 
sidered to be “easy” while problems that require exponential time are considered to 
be “hard”. 
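
To make the contrast between the two definitions concrete, the following Python sketch 
tabulates a quadratic step count $k^2$ against an exponential step count $2^k$ (that is, 
$e^{ck}$ with $c = \ln 2$) for a few input sizes k. The function name step_counts and the 
chosen sizes are illustrative assumptions, not part of the definitions above. 

```python
def step_counts(k):
    """Return (polynomial, exponential) step counts for an input of k bits."""
    return k ** 2, 2 ** k


# Print a small comparison table for a few input sizes.
for k in (8, 16, 32, 64):
    poly, expo = step_counts(k)
    print(f"k = {k:2d}: k^2 = {poly:5d}, 2^k = {expo}")
```

Already for k = 64 the exponential count exceeds $10^{19}$, while the quadratic count is 
only 4096. 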

Since we are going to analyze algorithms, we first need to say what we mean by 
an algorithm. The next section introduces this notion together with the notation 
used to describe algorithms. 


10.8.4 Notations for Algorithms 

The notion of an algorithm is one of the basic concepts in theoretical computer 
science. Informally, an algorithm is a well-defined computational procedure 
that takes some value, or a set of values, as input and produces some value, 
or a set of values, as output. In the theory of algorithms and computational 
complexity, the notion of an algorithm has been formalized as a program for a 
particular theoretical machine model known as the Turing machine. Let us start 
with some simple notions and notation related to algorithms for integers. 

• An integer variable contains an integer. Variables are denoted by typewriter type 
names like a, b, k, l, and so on. 

• An array corresponds to a sequence of variables of the same type, indexed by a 
segment of the natural numbers. For example, int a[10] denotes an array with 
10 entries of integer type, while char a[10] denotes an array with 10 entries of 
character type. An array element is given by the name with the index in brackets. 
For example, a[0] denotes the first entry, while a[5] denotes the sixth entry of 
this array. 

• If x is some integral value obtained by evaluating some expression and v is an 
integer type variable, then v <- x denotes an instruction that causes the value x 
to be put into the variable v. 

• The if-then statement and the if-then-else statement are used with the obvious 
semantics. Let Bool-exp denote an expression that evaluates to a boolean value and 
stat denote a (simple or composite) statement. Then the statement 

if Bool-exp then stat 

is executed as follows: first the boolean expression Bool-exp is evaluated to either 
true or false. If the result is true, the (simple or composite) statement stat is 
carried out. Similarly, the statement 

if Bool-exp then stat1 else stat2 

is executed as follows: if the value of the boolean expression Bool-exp is true, 
then stat1 is carried out, otherwise stat2 is carried out. 

• In an algorithm, it is sometimes required to perform a set of instructions 
repeatedly. This involves repeating some part of the algorithm either a specified 
number of times or until a particular condition is satisfied. Such repeated operations 
may be carried out using loop control instructions through the while statement. The 
statement 

while Bool-exp do stat 

is executed as follows: first the boolean expression Bool-exp is evaluated. 
If the outcome is true, the body stat is carried out once, and we start again 
carrying out the whole while statement. Otherwise the execution of the statement 
is finished. 

• The loop body may contain a special instruction, known as break, which, when 
executed, immediately terminates the execution of the innermost loop in which the 
break statement occurs. 

• The return statement is used to immediately finish the execution of the algorithm. 
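
The constructs listed above can be seen together in a short Python sketch. The function 
first_zero_index is a hypothetical example chosen only to exercise assignment, an array, 
if-then-else, while, break, and return; the comments give the corresponding pseudocode. 

```python
def first_zero_index(a):
    """Return the index of the first 0 in the array a, or -1 if there is none."""
    i = 0                  # i <- 0
    found = -1             # found <- -1
    while i < len(a):      # while Bool-exp do stat
        if a[i] == 0:      # if Bool-exp then stat1 else stat2
            found = i
            break          # terminate the innermost loop immediately
        else:
            i = i + 1      # i <- i + 1
    return found           # finish the execution of the algorithm
```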


10.8.5 Analysis of Few Important Number Theoretic Algorithms 

The efficiency of an algorithm can be measured in terms of execution time (time 
complexity) and the amount of memory required (space complexity). Which measure 
is more important? The answer often depends on the limitations of the technology 
available at the time of analysis. In this section, we deal only with time 
complexity and not with space complexity. Note that the time complexity analysis 
of an algorithm must be independent of the programming language and the machine 
used. The major objectives of time complexity analysis are to determine the 
feasibility of an algorithm by estimating an upper bound on the amount of work 
performed by the machine and to compare different algorithms before deciding 
which one to implement. To simplify the analysis, we sometimes ignore work that 
takes a constant amount of time. 

In this section, as we will be analyzing some number theoretic algorithms, 
let us first look at how a non-negative integer n is written to the base b. Let 
$n = c_{k-1}b^{k-1} + c_{k-2}b^{k-2} + \cdots + c_1 b + c_0$, where the $c_i$'s are integers 
between 0 and $b - 1$. Then we say that $(c_{k-1}c_{k-2} \cdots c_1 c_0)_b$ is the representation 
of n to the base b. If $c_{k-1}$ is non-zero, then we call n a k-digit base b number. Note 
that any integer in the interval $[b^{k-1}, b^k - 1]$ is a k-digit number to the base b. 

Theorem 10.8.1 Let n be a positive integer. Then the number of digits in the 
representation of n to the base b is $\lfloor \log_b n \rfloor + 1$, where $\lfloor x \rfloor$ represents the 
greatest integer not exceeding x. 

Proof Note that any integer n satisfying $b^{k-1} \le n < b^k$ has k digits to the base b. Then 
$\log_b b^{k-1} \le \log_b n < \log_b b^k$ implies $k - 1 \le \log_b n < k$, which implies 
$\lfloor \log_b n \rfloor = k - 1$, with the result $k = \lfloor \log_b n \rfloor + 1$. □ 
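
As a quick check of Theorem 10.8.1, the following Python sketch counts base-b digits by 
repeated division and compares the count at both ends of the interval 
$[b^{k-1}, b^k - 1]$. The helper name num_digits is our own; exact integer arithmetic is 
used instead of a floating-point logarithm to avoid rounding issues. 

```python
def num_digits(n, b):
    """Count the base-b digits of a positive integer n by repeated division."""
    count = 0
    while n > 0:
        n //= b        # drop the least significant base-b digit
        count += 1
    return count


# Every integer in [b^(k-1), b^k - 1] should have exactly k digits.
for b in (2, 3, 10):
    for k in range(1, 20):
        assert num_digits(b ** (k - 1), b) == k
        assert num_digits(b ** k - 1, b) == k
```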

Bit Operations While analyzing any program or algorithm to be run by a computer, 
we are interested in calculating the total time taken by the computer to run the 
algorithm. For example, suppose that by using two different approaches we write two 
different programs, say $P_1$ and $P_2$, that check whether a given integer is prime or not. 
On the same input on the same machine, suppose $P_1$ takes less time than $P_2$ to produce a 
correct output. Then surely we will call $P_1$ a better algorithm than $P_2$. So, it seems 
that we can measure the goodness of an algorithm by its run time. But if we consider 
only the run time, we may be led to a wrong conclusion. Suppose we run the program 
$P_1$ on a very old machine, say a 15-year-old machine, while running the program $P_2$ 
on a new machine with the latest hardware. It may happen that $P_2$ takes much less 
time than $P_1$ to give the correct answer on the same input. So the run time of a 
program should not be the measure of goodness of an algorithm. We need something 
more basic that is machine independent. One of the best measures is the number of 
bit operations required to complete the algorithm. The amount of time a computer 
takes to perform a task is proportional to the number of bit operations. So from now 
on, when we speak of estimating the time required by a computer to perform a 
particular task, we mean estimating the number of bit operations required by it. 
Let us explain through the following example exactly what we mean by bit operations. 

Example 10.8.1 (Addition of two k bits long binary strings) Suppose we want to 
add two binary strings, say a and b. If one of the integers has fewer bits than the 
other, we fill in zeros to the left of the integer to make both integers of the same 
length, say k. Let us explain the addition operation through the following example: 

      1 1 1 1 1          <- carry bits 
      1 0 0 1 1 1        <- the 6 bit long binary string a 
  +     1 1 1 0 1        <- the 5 bit long binary string b 
    ------------- 
    1 0 0 0 1 0 0        <- the output binary string a + b 

Let us analyze the addition algorithm in detail. We need to repeat the following steps 
k times, moving from right to left: 

1. Look at the top and the bottom bit and also at whether there is any carry above the 
top bit. Note that for the right-most bits, there will be no carry bit. For example, 
in the above case, the right-most bits of a and b are 1 and 1, respectively, without 
any carry bit above the right-most bit of a. 

2. If both bits are 0 and there is no carry, then put down 0 and move on to the left, 
if there are more bits left. 

3. If either (i) both bits are 0 and there is a carry, or (ii) one of the bits is 0, the other 
is 1, and there is no carry, then put down 1 and move to the left, if more bits are 
left. 

4. If either (i) one of the bits is 0, the other is 1, and there is a carry, or else (ii) both 
bits are 1 and there is no carry, then put down 0 along with a carry in the next 
column, and move on to the left, if more bits are left. 

5. If both bits are 1 and there is a carry, then put down 1, put a carry in the next 
column and move on to the left, if more bits are left. 

Carrying out this procedure once is called a bit operation. So, the addition of two 
k-bit binary strings requires $O(k)$ bit operations. 
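
The column-by-column procedure can be sketched in Python as follows. This is a minimal 
illustration, assuming the inputs are strings of the characters 0 and 1; the name 
add_bits and the returned operation counter are our own additions. Each pass through 
the loop handles one column and is counted as one bit operation. 

```python
def add_bits(a, b):
    """Add two binary strings column by column, counting bit operations."""
    k = max(len(a), len(b))
    a, b = a.zfill(k), b.zfill(k)      # pad the shorter string with zeros on the left
    carry, out, ops = 0, [], 0
    for i in range(k - 1, -1, -1):     # move from right to left
        total = int(a[i]) + int(b[i]) + carry
        out.append(str(total % 2))     # bit to put down in this column
        carry = total // 2             # carry for the next column
        ops += 1                       # one bit operation per column
    if carry:
        out.append("1")                # a final carry becomes the leading bit
    return "".join(reversed(out)), ops
```

For the strings of Example 10.8.1, add_bits("100111", "11101") gives the sum 1000100 
after 6 bit operations. 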

Example 10.8.2 (Multiplication of a k-bit long binary string with an l-bit long 
binary string) Suppose we want to multiply a k-bit integer n by an l-bit integer m. 
Let us explain the method through an example by taking the binary strings 
n = 10110 and m = 1011. So here k = 5 and l = 4. The multiplication may be done 
as follows: 

            1 0 1 1 0        <- binary representation of n 
    x         1 0 1 1        <- binary representation of m 
      --------------- 
            1 0 1 1 0        Row 1 
          1 0 1 1 0 0        Row 2 
                             Row 3 (does not appear) 
      1 0 1 1 0 0 0 0        Row 4 
      --------------- 
      1 1 1 1 0 0 1 0        Row 5: output of the multiplication 



As we are multiplying the number n by an l-bit integer m, we obtain at most l 
rows as shown above, where each row consists of a copy of n shifted to the left a 
certain number of times, i.e., with 0's put on at the end. Note that each occurrence 
of 0 in m removes one row. For example, in the above example, due to the occurrence 
of a single 0 in m, the third row in the summation does not appear. Suppose there are 
$l'$ rows appearing in the summation part (Row 1 to Row 4). Clearly, $l' \le l$. As we are 
interested in counting the number of bit operations, we cannot add all the rows 
simultaneously if $l' > 2$. Instead we first add the first two rows, then add the result 
to the third row, and so on. The details of the addition process are as follows: 

1. At each stage, we count the number of places to the left the number n has been 
shifted to form the new row. For example, in the above multiplication, the second 
row of the summation part is a single shift of n to the left, with a 0 put on 
at the end, i.e., the second row becomes 101100. 

2. We copy down that many right-most bits of the partial sum, and then add to n the 
integer formed from the rest of the partial sum. As explained in Example 10.8.1, 
this takes $O(k)$ bit operations. For example, in the above sum, while adding Row 1 
and Row 2, we first count the number of places the number n has been shifted 
to the left to form Row 2. We see that, in Row 2, the number n has been 
shifted one position to the left. So, we cut that many bits, here only one bit, from 
the right of the first row 10110 to form a new binary string 1011, and add this 
newly formed number to n = 10110 to get s = 100001. The partial sum of 
Row 1 and Row 2 is obtained by appending to the right of s = 100001 the 
bit that was cut from the right of the first row, so that the partial sum is 
p = 1000010. Next we add this partial sum to Row 3. Due to the occurrence of 0 
in m, Row 3 is ignored or filled up with zeros. So, the partial sum of Row 1 to 
Row 3 is the same as that of Row 1 and Row 2. So, we look at Row 4 and count 
the number of left shifts, which is 3. We cut three bits from the right of the 
partial sum p = 1000010 and get the binary string 1000. We add this newly formed 
string to n = 10110 to get 11110. Finally we append the three bits 010 that were 
cut from the right of the partial sum p = 1000010 to the right of 11110 to get the 
final answer 11110010. 

3. The example shows that the multiplication task can be broken down into $l' - 1$ 
additions, each taking $O(k)$ bit operations. As $l' \le l$, the number of bit operations 
required is bounded by $O(kl)$. 
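
The row-by-row procedure of Example 10.8.2 can be sketched in Python as below. This is 
an illustration under our own naming (shift_and_add); for brevity the addition of n to 
the upper part of the partial sum uses Python integers rather than the bitwise routine 
of Example 10.8.1, and the function reports how many such additions were performed. 

```python
def shift_and_add(n_bits, m_bits):
    """Multiply two binary strings row by row; return (product, number of additions)."""
    n = int(n_bits, 2)
    partial = None                     # no row has been processed yet
    additions = 0
    for shift, bit in enumerate(reversed(m_bits)):
        if bit == "0":
            continue                   # a 0 in m contributes no row
        if partial is None:
            partial = n << shift       # first row: a shifted copy of n
        else:
            low = partial & ((1 << shift) - 1)   # right-most bits that are copied down
            high = partial >> shift              # rest of the partial sum
            partial = ((high + n) << shift) | low
            additions += 1
    product = partial if partial is not None else 0
    return bin(product)[2:], additions
```

With n = 10110 and m = 1011 this yields the product 11110010 using 2 additions, which 
matches $l' - 1$ for the $l' = 3$ rows that appear. 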

Remark In the above example, we only count the number of bit operations and 
neglect the time that a machine takes to shift the bits in n a few places to the left, 
or the time it takes to copy or to cut a suitable number of bits of the partial sum 
corresponding to the places through which n has been shifted to the left in the new 
row. On modern machines, the shifting, copying, and cutting operations are very fast 
compared to the large number of bit operations. As a result, we can safely ignore 
them. So, the time estimate for an arithmetic task may be defined as an upper bound 
for the number of bit operations, without including any consideration of shift 
operations, changing registers (copying), memory access, etc. 


10.8.6 Euclidean and Extended Euclidean Algorithms 

Given two integers x and y, not both zero, the greatest common divisor of x and 
y, denoted by gcd(x, y), is defined to be the largest integer g dividing both x and 
y. If by some means we know the prime factorizations of both x and y, then finding 
gcd(x, y) is very easy. The gcd is simply the product of all primes which 
occur in both factorizations, each raised to the minimum of the two exponents. For 
example, let $x = 86826881610864294564952032561107 = 53^5 \cdot 149^3 \cdot 251^7$ and 
$y = 21919689538097272917485163493 = 53^4 \cdot 149^2 \cdot 277^7$. Then 
$\gcd(x, y) = 53^4 \cdot 149^2 = 175176568681$. Now if we deal with large numbers like x = 
114555615673 
899844817675135734699353962887022183538809518439182723858423697776 
6549697736360253515936116158310277920547232914911958057425827985077 
569436580646352451404300630290628888781488659516752482082350959575 
53631424680802957119280684885426825458311 
having 252 digits and y = 
1320 
7363278391631158808494622912915162971190701945853007595880800438217 
919187351065628033485807414975315530111573252485051699075283080652 
503820829069331082031227451253664670887101093132056160100710024856 
76467559886188923580532087187320097935537658916537250176345966729 
71 
having 270 digits, then finding gcd(x, y) using the prime factorization method 
is not effective, as it is likely that the prime factorizations of x and y are not 
known. In fact, in the theory of numbers, an important area of research is the search 
for quicker methods of factoring large integers. In that respect, we are fortunate 
enough to have a quick way to find the gcd of two integers x and y, even when we do 
not have any idea of the prime factorizations of x and y. 

In Chap. 1, we discussed the greatest common divisor (gcd) of two 
positive integers. In this chapter, we discuss how it can be computed efficiently 
by the Euclidean algorithm, and we analyze the efficiency of the algorithm. 
We shall show that this algorithm is very efficient in the sense that it runs 
in time polynomial in the size of the inputs. For that, let us first describe the 
algorithm. 

Euclidean Algorithm Let $r_0 = a$ and $r_1 = b$ be two integers with a > b > 0. If the 
division algorithm is successively applied to obtain $r_j = r_{j+1}q_{j+1} + r_{j+2}$, where 
$0 < r_{j+2} < r_{j+1}$, for j = 0, 1, ..., n - 2 and $r_{n+1} = 0$, then $\gcd(a, b) = r_n$, the last 
non-zero remainder. 

Let us try to illustrate the algorithm through the following example. 

Example 10.8.3 Let us try to compute gcd(55, 34) using the Euclidean algorithm. The 
steps are as follows: 

    55 = 34 · 1 + 21, 
    34 = 21 · 1 + 13, 
    21 = 13 · 1 + 8, 
    13 =  8 · 1 + 5, 
     8 =  5 · 1 + 3, 
     5 =  3 · 1 + 2, 
     3 =  2 · 1 + 1      <- r_n, 
     2 =  1 · 2 + 0      <- r_{n+1}. 

Thus gcd(55, 34) = 1. 

A pseudocode for the Euclidean algorithm is presented in Table 10.1. 

Table 10.1 Euclidean algorithm 

Input: Two positive integers a and b. 
Method: 
1. r_0 <- a 
2. r_1 <- b 
3. i <- 1 
4. while r_i ≠ 0 
5.     do q_i <- ⌊r_{i-1} / r_i⌋ 
          r_{i+1} <- r_{i-1} - q_i r_i 
          i <- i + 1 
6. i <- i - 1 
7. return gcd(a, b) = r_i 
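
A direct Python transcription of the Euclidean algorithm may look as follows. It is a 
minimal sketch: the function name euclid is our own, and the list of quotients 
$q_1, \ldots, q_n$ is returned in addition to the gcd only because the quotients are 
convenient for illustrating the steps. 

```python
def euclid(a, b):
    """Return (gcd(a, b), [q_1, ..., q_n]) for positive integers a and b."""
    r_prev, r_cur = a, b               # r_0 <- a, r_1 <- b
    quotients = []
    while r_cur != 0:
        q = r_prev // r_cur            # q_i <- floor(r_{i-1} / r_i)
        quotients.append(q)
        r_prev, r_cur = r_cur, r_prev - q * r_cur
    return r_prev, quotients           # r_prev is the last non-zero remainder
```

For instance, euclid(55, 34) returns the gcd 1 together with the quotients 
1, 1, 1, 1, 1, 1, 1, 2 of Example 10.8.3. 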


Now we shall show that the Euclidean algorithm always gives the greatest common 
divisor of two positive integers in a finite number of steps. For that, the following 
lemma plays an important role. 

Lemma 10.8.1 If a and b are two positive integers and a = bq + r, where q and r 
are integers such that $0 \le r < b$, then gcd(a, b) = gcd(b, r). 

Proof Let d be a common divisor of a and b. As r = a - bq, d divides r, so d is a 
common divisor of b and r. Conversely, if d is a common divisor of b and r, then 
since a = bq + r, d is also a divisor of a. Since the common divisors of a and b are 
the same as the common divisors of b and r, we get gcd(a, b) = gcd(b, r). □ 

Now we are going to prove the correctness of the Euclidean algorithm. 

Proof Given that $r_0 = a$ and $r_1 = b$. Now by successively applying the division 
algorithm, we have 

$$
\begin{aligned}
r_0 &= r_1 q_1 + r_2, & 0 < r_2 &< r_1 \\
r_1 &= r_2 q_2 + r_3, & 0 < r_3 &< r_2 \\
&\;\;\vdots \\
r_{j-2} &= r_{j-1} q_{j-1} + r_j, & 0 < r_j &< r_{j-1} \\
&\;\;\vdots \\
r_{n-3} &= r_{n-2} q_{n-2} + r_{n-1}, & 0 < r_{n-1} &< r_{n-2} \\
r_{n-2} &= r_{n-1} q_{n-1} + r_n, & 0 < r_n &< r_{n-1} \\
r_{n-1} &= r_n q_n.
\end{aligned}
$$

Since the sequence $a = r_0 > r_1 > r_2 > r_3 > \cdots \ge 0$ is a strictly decreasing 
sequence of non-negative integers, we eventually obtain a remainder zero. Thus by 
Lemma 10.8.1, we have $\gcd(a, b) = \gcd(r_0, r_1) = \gcd(r_1, r_2) = \gcd(r_2, r_3) = \cdots = 
\gcd(r_{n-2}, r_{n-1}) = \gcd(r_{n-1}, r_n) = \gcd(r_n, 0) = r_n$, the last non-zero remainder. 
This completes the proof. □ 

Remark The Euclidean algorithm may be used to determine whether a positive 
integer a, less than a given positive integer n, has a multiplicative inverse modulo n. 
If gcd(a, n) = 1, we can say that the multiplicative inverse of a modulo n exists. 
However, the Euclidean algorithm does not directly provide the multiplicative 
inverse modulo n, if it exists. 

Now to analyze the time complexity of the Euclidean algorithm, we need the 
following lemma. 

Lemma 10.8.2 In the above mentioned Euclidean algorithm, $r_{j+2} < \frac{1}{2} r_j$, for all 
j = 0, 1, ..., n - 2. 

Proof If $r_{j+1} \le \frac{1}{2} r_j$, then we have $r_{j+2} < r_{j+1} \le \frac{1}{2} r_j$. So, in this case, we 
are done. Now suppose $r_{j+1} > \frac{1}{2} r_j$. Note that $\frac{1}{2} r_j < r_{j+1}$ and $r_{j+1} < r_j$ imply 
$r_{j+1} < r_j < 2r_{j+1}$. Thus in the expression $r_j = r_{j+1}q_{j+1} + r_{j+2}$, we must have 
$q_{j+1} = 1$, i.e., $r_{j+2} = r_j - r_{j+1} < r_j - \frac{1}{2} r_j = \frac{1}{2} r_j$. □ 

Now we are in a position to analyze the Euclidean algorithm through the follow- 
ing theorem. 

Theorem 10.8.2 The time complexity of the Euclidean algorithm is 
$O((\log_2 a)^3)$. 

Proof Lemma 10.8.2 asserts that every two steps in the Euclidean algorithm 
cut the size of the remainder at least in half. As the remainder never 
goes below 0, it follows that there are at most $2\lfloor \log_2 a \rfloor$ such steps. Again, as 
each step involves a division and each division involves numbers not larger than a, 
each step takes $O((\log_2 a)^2)$ bit operations. Thus the total time required is 
$O(\log_2 a) \cdot O((\log_2 a)^2) = O((\log_2 a)^3)$. □ 

Remark A more careful analysis of the number of bit operations required by the 
Euclidean algorithm, taking into account the decreasing size of the numbers in the 
successive divisions, can improve the time complexity of the Euclidean algorithm 
to $O((\log_2 a)^2)$ bit operations. 

As an extension of the Euclidean algorithm, we state the following algorithm 
known as the Extended Euclidean Algorithm. In Chap. 1, we have already seen that 
if d is the greatest common divisor of a and b, then there exist integers u and v such 
that d = au + bv. The Extended Euclidean algorithm actually provides a method to find 
such u and v. To understand the Extended Euclidean algorithm, let us first define 
two sequences of integers $u_0, u_1, u_2, \ldots, u_n$ and $v_0, v_1, v_2, \ldots, v_n$ according to the 
following recurrences, in which the $q_i$'s are defined as in Table 10.1: 

$$
u_i = \begin{cases} 0 & \text{if } i = 0 \\ 1 & \text{if } i = 1 \\ u_{i-2} - q_{i-1}u_{i-1} & \text{if } i \ge 2 \end{cases}
$$

and 

$$
v_i = \begin{cases} 1 & \text{if } i = 0 \\ 0 & \text{if } i = 1 \\ v_{i-2} - q_{i-1}v_{i-1} & \text{if } i \ge 2. \end{cases}
$$

Now we are going to prove the following theorem, which provides a method to find 
u and v, for two given positive integers $a = r_0$ and $b = r_1$, such that gcd(a, b) = 
au + bv (in the notation of the theorem, one may take $u = v_n$ and $v = u_n$). 


Theorem 10.8.3 For $0 \le i \le n$, let us consider $r_i$ as defined in Table 10.1 and 
$u_i$ and $v_i$ as defined above. Then, for $0 \le i \le n$, $r_i = r_0 v_i + r_1 u_i$. 

Proof We shall prove the theorem by using the principle of mathematical induction 
on i. The result is trivially true for i = 0 and i = 1. Let us assume that the result 
is true for all i < n, where $n \ge 2$. Now we shall prove the result for i = n. By the 
induction hypothesis, we have 

$$r_{n-2} = v_{n-2}r_0 + u_{n-2}r_1$$

and 

$$r_{n-1} = v_{n-1}r_0 + u_{n-1}r_1.$$

Now, 

$$
\begin{aligned}
r_n &= r_{n-2} - q_{n-1}r_{n-1} \\
&= v_{n-2}r_0 + u_{n-2}r_1 - q_{n-1}(v_{n-1}r_0 + u_{n-1}r_1) \\
&= (v_{n-2} - q_{n-1}v_{n-1})r_0 + (u_{n-2} - q_{n-1}u_{n-1})r_1 \\
&= r_0 v_n + r_1 u_n.
\end{aligned}
$$

Hence the result is true for i = n. Consequently, by the principle of mathematical 
induction, the result is true for all $i \ge 0$. This completes the proof of the theorem. □ 

A pseudocode for the Extended Euclidean algorithm is presented in Table 10.2. 

Remark By an argument similar to that for the Euclidean algorithm, it can also 
be shown that the time required to execute the Extended Euclidean Algorithm is 
$O((\log_2 a)^3)$ bit operations. 

Remark The Extended Euclidean algorithm may be used to determine the 
multiplicative inverse of a given positive integer a modulo n, if it exists. Let 
gcd(a, n) = 1. Then using the Extended Euclidean algorithm we can find u and 
v such that 1 = au + nv. Then $1 \equiv au \pmod{n}$, with the result that (u) is the 
multiplicative inverse of (a) in $\mathbb{Z}_n^*$. 

Table 10.2 Extended Euclidean algorithm: given two positive integers a and b, 
this algorithm returns r, u, and v such that r = gcd(a, b) = au + bv 

Input: Two positive integers a and b. 
Method: 
1. a_0 <- a 
2. b_0 <- b 
3. v_0 <- 0 
4. v <- 1 
5. u_0 <- 1 
6. u <- 0 
7. q <- ⌊a_0 / b_0⌋ 
8. r <- a_0 - q b_0 
9. while r > 0 
10.    do temp <- v_0 - q v 
          v_0 <- v 
          v <- temp 
          temp <- u_0 - q u 
          u_0 <- u 
          u <- temp 
          a_0 <- b_0 
          b_0 <- r 
          q <- ⌊a_0 / b_0⌋ 
          r <- a_0 - q b_0 
11. r <- b_0 
12. return r, u, and v 
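
The Extended Euclidean algorithm and its use for modular inverses can be sketched in 
Python as follows. The variable names follow the convention r = au + bv; the helper 
mod_inverse and its choice of returning None when no inverse exists are our own 
assumptions for illustration. 

```python
def extended_euclid(a, b):
    """Return (r, u, v) with r = gcd(a, b) = a*u + b*v for positive integers a, b."""
    a0, b0 = a, b
    u0, u = 1, 0                       # running coefficients of a
    v0, v = 0, 1                       # running coefficients of b
    q = a0 // b0
    r = a0 - q * b0
    while r > 0:
        u0, u = u, u0 - q * u          # keeps b0 = a*u + b*v true after the update
        v0, v = v, v0 - q * v
        a0, b0 = b0, r
        q = a0 // b0
        r = a0 - q * b0
    return b0, u, v


def mod_inverse(a, n):
    """Return the inverse of a modulo n, or None when gcd(a, n) != 1."""
    r, u, _ = extended_euclid(a, n)
    return u % n if r == 1 else None
```

For example, extended_euclid(3, 7) returns (1, -2, 1), since 3 · (-2) + 7 · 1 = 1, so 
the inverse of 3 modulo 7 is -2 mod 7 = 5. 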


10.8.7 Modular Arithmetic 


If we can find efficient algorithms for the basic arithmetic operations over the set 
of integers, then we immediately obtain efficient algorithms for the corresponding 
operations modulo some positive integer n. We note the following: 

Suppose n is a k-bit integer and $0 \le m_1, m_2 \le n - 1$. Then 

• the computation of $(m_1 + m_2) \pmod{n}$ can be done in time $O(k)$; 

• the computation of $(m_1 - m_2) \pmod{n}$ can be done in time $O(k)$; 

• the computation of $(m_1 \cdot m_2) \pmod{n}$ can be done in time $O(k^2)$; 

• the computation of $(m_1^{-1}) \pmod{n}$ can be done in time $O(k^3)$. 


10.8.8 Square and Multiply Algorithm 


We are now going to compute a function of the form $x^c \pmod{n}$ for a given 
non-negative integer c and positive integer n. Computation of $x^c \pmod{n}$ may be done 
using c - 1 modular multiplications. Note that c might be as big as $\phi(n)$, which 
is almost as big as n and exponentially large compared to the size of n. So, this 
method is very inefficient if c is very large. However, there exists an efficient 
algorithm, known as the square and multiply algorithm, which reduces the number of 
modular multiplications required to compute $x^c \pmod{n}$ to at most 2l, where l is 
the number of bits in the binary representation of c. Let the binary representation of 
c be $(c_{l-1}c_{l-2} \cdots c_1 c_0)_2$, where $c = \sum_{i=0}^{l-1} c_i 2^i$ and $c_i \in \{0, 1\}$. The 
algorithm to compute $x^c \pmod{n}$ is given in Table 10.3. 

Table 10.3 Square and multiply algorithm to compute x^c (mod n) 

Input: Three positive integers x, c, and n. 
Method: 
1. a <- 1 
2. i <- l - 1 
3. while i ≥ 0 
4.     do a <- a^2 (mod n) 
          if c_i = 1 
              then a <- (a * x) (mod n) 
          i <- i - 1 
5. return a 

Let us now illustrate the square and multiply algorithm through the following 
example. 

Example 10.8.4 Using the square and multiply algorithm, let us compute 
$2013^{210} \pmod{3197}$. Note that the binary representation of 210 is $(11010010)_2$. Thus 
here l = 8. The steps are explained in Table 10.4. Thus $2013^{210} \pmod{3197} = 3025$. 

Table 10.4 Steps to calculate 2013^210 (mod 3197) using the square and multiply algorithm 

i    c_i    a 
7    1      1^2 * 2013 (mod 3197) = 2013 
6    1      2013^2 * 2013 (mod 3197) = 1774 
5    0      1774^2 (mod 3197) = 1228 
4    1      1228^2 * 2013 (mod 3197) = 1110 
3    0      1110^2 (mod 3197) = 1255 
2    0      1255^2 (mod 3197) = 2101 
1    1      2101^2 * 2013 (mod 3197) = 55 
0    0      55^2 (mod 3197) = 3025 

Remark Note that in the square and multiply algorithm, the number of squarings 
performed is l. The number of modular multiplications of type a <- a * x (mod n) 
is equal to the number of 1's present in the binary representation of c. Thus, 
$O(\log_2 c \cdot (\log_2 n)^2)$ bit operations are required to execute the square and multiply 
algorithm to compute $x^c \pmod{n}$. 
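
The square and multiply algorithm translates almost line by line into Python. The 
sketch below is our own transcription (the function name square_and_multiply is an 
assumption); it scans the bits $c_{l-1}, \ldots, c_0$ of c from the most significant end, 
squaring at every step and multiplying by x whenever the current bit is 1. 

```python
def square_and_multiply(x, c, n):
    """Compute x**c mod n by scanning the bits of c from the most significant end."""
    a = 1
    for bit in bin(c)[2:]:             # bits c_{l-1}, ..., c_1, c_0 as characters
        a = (a * a) % n                # squaring step
        if bit == "1":
            a = (a * x) % n            # multiply step, only when c_i = 1
    return a
```

For the data of Example 10.8.4, square_and_multiply(2013, 210, 3197) returns 3025, in 
agreement with Python's built-in pow(2013, 210, 3197). 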

Next we deal with one of the most important problems in number theory, known 
as “primality testing”. Algebraic techniques are very useful in primality testing. 
In the next section, we show how algebra can be an important tool in primality 
testing. 


10.9 Primality Testing 

Due to the great invention of public key cryptography at the end of the 20th century, 
the interest in primality testing, i.e., checking whether a number is prime or not, has 
grown rapidly in the past three decades. The security of this type of cryptographic 
scheme primarily relies on the difficulty involved in factoring the product of very 
large primes. Integer factorization poses many problems, a key one being the testing 
of numbers for primality. A reliable and fast test for primality would help us 
in constructing efficient and secure cryptosystems. Therefore, the mathematics and 
computer science communities have begun to address the problem of primality 
testing. A primality test is simply a function that determines if a given integer 
greater than 1 is prime or composite. The following theorem plays an important role 
in primality testing. 

Theorem 10.9.1 If n is a positive composite integer, then n has a prime divisor not 
exceeding $\sqrt{n}$. 

Proof As n is composite, we can write n = ab, where $1 < a \le b < n$. We claim that 
$a \le \sqrt{n}$. If not, then $b \ge a > \sqrt{n}$, which implies $n = ab > \sqrt{n}\sqrt{n} = n$, a 
contradiction. Hence, $a \le \sqrt{n}$. Again, as a > 1, by Theorem 10.2.1, a must have a prime 
divisor, say p, which is clearly less than or equal to $\sqrt{n}$. □ 

Though the recent interest in primality testing grew at the end of the 20th century, 
the quest for discovering a good test is by no means a new one, and very likely one 
of the oldest issues in mathematics. The ancient priests in Uruk the Sheepfold, circa 
2500 B.C., are known to have inscribed long lists of prime numbers. Both the ancient 
Greeks and the ancient Chinese independently developed primality testing. 


10.9.1 Deterministic Primality Testing 

One of the simplest and most famous primality tests is the Sieve of Eratosthenes, 
named after Eratosthenes, who lived in Greece around 200 B.C. His method for 
determining primality is as follows. Suppose we want to determine if n is prime or 
not. First, we make a list of all integers $2, 3, \ldots, \lfloor \sqrt{n} \rfloor$. Next, we circle 2 and 
cross off from the list all multiples of two. Then we circle 3 and cross off its 
multiples. We now continue through the list, each time advancing to the least integer 
that is not crossed off, circling that integer and crossing off all its multiples. We 
then test to see if any of the circled numbers divide n. If the list of circled numbers 
is exhausted and no divisor is found, then n is prime. This algorithm is based on the 
simple observation that, if n is composite, then n has a prime factor less than or 
equal to $\sqrt{n}$. Though the algorithm itself is fairly straightforward and easy to 
implement, it is by no means efficient. 
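
The sieve-based test just described can be sketched in Python as follows. The function 
names primes_up_to and is_prime_by_sieve are our own, and the sketch assumes n > 1. 
Crossing off starts at $p^2$ because smaller multiples of p have already been crossed 
off by smaller circled numbers; this is a common shortcut rather than part of the 
description above. 

```python
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: list all primes p with 2 <= p <= limit."""
    if limit < 2:
        return []
    crossed = [False] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if not crossed[p]:
            primes.append(p)                         # "circle" p
            for multiple in range(p * p, limit + 1, p):
                crossed[multiple] = True             # cross off multiples of p
    return primes


def is_prime_by_sieve(n):
    """Return True when no circled number up to floor(sqrt(n)) divides n (n > 1)."""
    return all(n % p != 0 for p in primes_up_to(isqrt(n)))
```

The work grows roughly like $\sqrt{n}$, which is exponential in the number of bits of n, 
so this is practical only for small n. 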

Next we are going to prove some results that lead to the notion of 
deterministic primality testing. Among these results, Fermat’s Little Theorem plays 
an important role. Recall that this theorem states that if p is a prime and a is any 
integer such that p does not divide a, then $a^{p-1} \equiv 1 \pmod{p}$. However, the converse 
of this theorem is not true. There exist composite integers N and an integer a such 
that $a^{N-1} \equiv 1 \pmod{N}$. These numbers play an important role in primality 
questions. Nevertheless, a partial converse of Fermat’s Little Theorem was discovered 
by Lucas in 1876. The following proposition may be used as a primality test. 

Proposition 10.9.1 Let N > 1. Assume that there exists an integer a > 1 such that 

1. $a^{N-1} \equiv 1 \pmod{N}$; 

2. $a^m \not\equiv 1 \pmod{N}$, for m = 1, 2, ..., N - 2. 

Then N is a prime integer. 

Proof Let us consider the group of units $(\mathbb{Z}_N^*, \cdot)$. Then the order of $\mathbb{Z}_N^*$ is $\phi(N)$, 
where $\phi$ is the Euler $\phi$ function. As $a^{N-1} = a^{N-2} \cdot a \equiv 1 \pmod{N}$, a is a unit in 
the ring $(\mathbb{Z}_N, +, \cdot)$. Thus $(a) \in \mathbb{Z}_N^*$. To show that N is prime, it is sufficient to 
show that $\phi(N) = N - 1$. By hypothesis, the order of (a) in the group $\mathbb{Z}_N^*$ is 
$N - 1$. As the order of an element in a finite group divides the order of the group, 
$N - 1$ divides $\phi(N)$. As $\phi(N) \le N - 1$, we have $\phi(N) = N - 1$. Hence N is prime. □ 

Remark Though this test might seem perfect, it requires $N - 2$ successive 
multiplications by a, and finding residues modulo N. So, for large N, the problem 
remains. 

In 1891, Lucas gave the following test for primality: 

Proposition 10.9.2 Let N > 1. Assume that there exists an integer a > 1, relatively 
prime to N, such that 

1. $a^{N-1} \equiv 1 \pmod{N}$; 

2. $a^m \not\equiv 1 \pmod{N}$, for every $m < N - 1$ such that m divides $N - 1$. 

Then N is a prime integer. 

Proof The proof follows directly from Proposition 10.9.1. □ 

Remark To apply the proposition, one needs to know all factors of $N - 1$. So, this 
test is easily applicable only when $N - 1$ can be factored, for example, when 
$N = 2^n + 1$ or $N = 3 \cdot 2^n + 1$, etc. 
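
The two conditions of Proposition 10.9.2 can be checked mechanically for a given base. 
The Python sketch below is only an illustration for small N: the function name 
lucas_witness is our own, and the proper divisors of N - 1 are found by brute force, 
which is exactly the step that becomes hard for large N. 

```python
def lucas_witness(N, a):
    """Check the two conditions of Proposition 10.9.2 for the base a."""
    if pow(a, N - 1, N) != 1:                      # condition 1
        return False
    proper_divisors = [m for m in range(1, N - 1) if (N - 1) % m == 0]
    return all(pow(a, m, N) != 1 for m in proper_divisors)   # condition 2
```

For N = 17 the base a = 3 satisfies both conditions, whereas a = 2 fails condition 2 
because $2^8 \equiv 1 \pmod{17}$. 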

Brillhart, Lehmer, and Selfridge came up with the following test in 1975: 

Proposition 10.9.3 Let N > 1. Assume that for every prime divisor q of $N - 1$, 
there exists an integer a > 1 such that 

1. $a^{N-1} \equiv 1 \pmod{N}$; 

2. $a^{(N-1)/q} \not\equiv 1 \pmod{N}$. 

Then N is prime. 

Proof As before, $(a) \in \mathbb{Z}_N^*$. To show that N is prime, it is sufficient to show that 
$N - 1$ divides $\phi(N)$. If possible, suppose $N - 1$ does not divide $\phi(N)$. Then there 
exist a prime factor p of $N - 1$ and a positive integer n such that $p^n$ divides $N - 1$ 
but does not divide $\phi(N)$. Let $N - 1 = \lambda_1 p^n$, for some integer $\lambda_1$. Further, let a be 
the integer associated with the prime divisor p, and let the order of (a) in $\mathbb{Z}_N^*$ be e. 
As by hypothesis $(a)^{N-1} = (1)$, e divides $N - 1 = \lambda_1 p^n$, and e does not divide 
$(N - 1)/p = \lambda_1 p^{n-1}$. Let $N - 1 = \lambda_1 p^n = e\lambda_2$, for some integer $\lambda_2$. Thus 
$\lambda_1 p^n = \lambda_1 p^{n-1} p = e\lambda_2$ implies p divides $e\lambda_2$. As p is prime, either p divides 
$\lambda_2$ or p divides e. Let p divide $\lambda_2$. Then $\lambda_2 = p\lambda_3$ for some integer $\lambda_3$. Thus 
$\lambda_1 p^n = \lambda_1 p^{n-1} p = e\lambda_2 = ep\lambda_3$ implies $\lambda_1 p^{n-1} = e\lambda_3$, which implies 
e divides $\lambda_1 p^{n-1} = (N - 1)/p$, a contradiction. Thus p does not divide $\lambda_2$. As $p^n$ 
divides $e\lambda_2$, and p does not divide $\lambda_2$, $p^n$ must divide e. Also, as $(a)^{\phi(N)} = (1)$ 
and the order of (a) in $\mathbb{Z}_N^*$ is e, e divides $\phi(N)$, with the result that $p^n$ divides 
$\phi(N)$, a contradiction. Hence the proposition follows. □ 

Remark To apply the proposition, once again, one needs to know all prime 
factors of $N - 1$. But here fewer congruences have to be satisfied than in 
Proposition 10.9.2. 
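
A search for the bases required by Proposition 10.9.3 can be sketched in Python as 
follows. This is an illustration for small N only: both helper names are our own, the 
prime factors of N - 1 are found by trial division, and the base for each prime 
divisor q is found by exhaustive search, so no efficiency is claimed. 

```python
def prime_factors(m):
    """Return the distinct prime factors of m by trial division (small m only)."""
    factors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def bls_certificate(N):
    """Return {q: a} with a suitable base a for each prime q dividing N - 1, else None."""
    certificate = {}
    for q in prime_factors(N - 1):
        for a in range(2, N):
            if pow(a, N - 1, N) == 1 and pow(a, (N - 1) // q, N) != 1:
                certificate[q] = a
                break
        else:
            return None                # no base works for this q
    return certificate
```

For N = 17 the only prime divisor of N - 1 is q = 2, and the search returns the base 
a = 3. 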


10.9.2 AKS Algorithm 

The AKS algorithm, named after its authors M. Agrawal, N. Kayal, and N. Saxena, is
the first deterministic polynomial-time primality test. The algorithm was presented
in August 2002 in the paper “PRIMES is in P” by Agrawal et al. (2002).
This solved the longstanding problem of testing the primality of an integer $N$
deterministically in polynomial time. The running time (with fast multiplication)
was originally evaluated as essentially $O((\log N)^{12})$ and was later lowered to
$O((\log N)^{7.5})$.

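The main idea in the new primality testing algorithm is the following identity
characterizing primes: for an integer $N > 1$,

$N$ is prime if and only if $(1 - X)^N \equiv 1 - X^N \pmod{N}$,

where both sides are regarded as polynomials in the indeterminate $X$ with integer
coefficients.

For more details, the reader may refer to the book by Dietzfelbinger (2004).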
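The identity above can be tested directly for small $N$ by comparing coefficients, since the coefficient of $X^k$ in $(1 - X)^N$ is $(-1)^k \binom{N}{k}$. The sketch below is only an illustration of the identity, not the AKS algorithm itself (which avoids expanding all $N + 1$ coefficients); the function name `satisfies_prime_identity` is an assumption of this example.

```python
from math import comb


def satisfies_prime_identity(N):
    """Return True if (1 - X)^N and 1 - X^N agree coefficientwise modulo N."""
    for k in range(N + 1):
        lhs = (-1) ** k * comb(N, k)      # coefficient of X^k in (1 - X)^N
        if k == 0:
            rhs = 1                        # constant term of 1 - X^N
        elif k == N:
            rhs = -1                       # leading term of 1 - X^N
        else:
            rhs = 0
        if (lhs - rhs) % N != 0:
            return False
    return True


print([N for N in range(2, 40) if satisfies_prime_identity(N)])
```

This check needs about $N$ coefficient comparisons, which is exponential in the number of digits of $N$, so it illustrates the characterization rather than an efficient test.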
10.10 Probabilistic or Randomized Primality Testing

The main aim of this section is to present some efficient but simple probabilistic
tests for primality using group theoretic results. The first question that
comes to mind is: “What do we mean by a probabilistic test?” The next immediate
question may be: “Is there any advantage in using a probabilistic or randomized
algorithm over a deterministic algorithm?”

First let us explore some of the advantages of probabilistic or randomized
algorithms over deterministic algorithms in the context of primality testing.
Suppose we are given an integer $n > 1$ and we want to determine whether $n$ is
prime or composite. If $n$ is composite, Theorem 10.9.1 ensures the existence of a
prime factor of $n$ within the range from 2 to $\lfloor \sqrt{n} \rfloor$, the greatest
integer not exceeding $\sqrt{n}$. So we simply divide $n$ by $2, 3, \ldots, \lfloor \sqrt{n} \rfloor$,
testing whether any of these numbers divides $n$. Note that this deterministic algorithm
not only tests the primality of $n$ but also produces a non-trivial factor of $n$
whenever $n$ is composite. Of course, the major drawback of this deterministic
algorithm is that it is terribly inefficient: it requires $O(\sqrt{n})$ arithmetic
operations, which is exponential in the binary length of $n$. Thus for practical
purposes, this algorithm is limited to small values of $n$. Let us see what happens
when this algorithm is used to test the primality of an integer $n$ having 100 decimal
digits on a computer that can perform 1 billion divisions per second. Performing the
roughly $\sqrt{n} \approx 10^{50}$ divisions would take on the order of $10^{33}$ years,
which is quite impractical.

So the question that comes to mind is: “Does there exist a deterministic primality
testing algorithm which is efficient?” In 2002, Agrawal et al. (2002) came up with a
path-breaking work which provides a deterministic algorithm to check whether a given
number is prime or composite in polynomial time. However, we must admit that
to date no deterministic primality testing algorithm is known which is as efficient
as the probabilistic primality testing algorithms, such as Solovay-Strassen primality
testing or Miller-Rabin primality testing. In this section, we shall develop those
probabilistic primality tests that allow 100 decimal digit numbers to be tested for
primality in less than a second. One important thing to note is that these algorithms
are probabilistic and may make mistakes. However, the probability that
they commit a mistake can be made so small as to be irrelevant for all practical
purposes. For example, we can easily make the probability of error as small as $2^{-100}$:
should one really care about an event that happens with such a negligible probability?

Let us now answer the question “What do we mean by a probabilistic or
randomized test?” For this we need two notions, namely,
decision problem and randomized algorithm. A decision problem is a problem in
which a question is to be answered “yes” or “no”, while a randomized algorithm is
any algorithm that uses random numbers; in contrast, an algorithm that does not use
random numbers is called a deterministic algorithm. For randomized algorithms,
the following definition plays an important role.
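The deterministic trial-division method discussed in this section can be sketched in a few lines of Python. The function name `smallest_factor` is an assumption of this example; the loop performs at most $\lfloor \sqrt{n} \rfloor - 1$ divisions, which is what makes the method impractical for 100-digit inputs.

```python
from math import isqrt


def smallest_factor(n):
    """Return the smallest divisor d of n with 1 < d <= sqrt(n), or None.

    For n > 1, a result of None means that n is prime.
    """
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d          # a non-trivial factor, so n is composite
    return None


for n in (91, 97, 341):
    print(n, smallest_factor(n))
```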

Definition 10.10.1 For a decision problem, a yes-biased Monte Carlo algorithm
is a randomized algorithm in which a “yes” answer is always correct but a “no”
answer may not be correct. Similarly, for a decision problem, a no-biased Monte
Carlo algorithm is a randomized algorithm in which a “no” answer is always correct
but a “yes” answer may not be correct.


Table 10.5 The decisional problem: Is-Composite

Problem: Is-Composite.
Instance: A positive integer n > 2.
Question: Is n composite?


Table 10.6 Solovay-Strassen primality test

Input: Odd integer $n \geq 3$.
Method:
1. Choose an integer $a$ randomly from $\{1, 2, 3, \ldots, n - 1\}$.
2. if $\gcd(a, n) \neq 1$
3.     return (“Yes: The number is a composite integer”).
4. $x = \left(\frac{a}{n}\right)$, the Jacobi symbol of $a$ modulo $n$
5. $y = a^{(n-1)/2} \pmod{n}$
6. if $x \equiv y \pmod{n}$
7.     return (“No: The number is a prime integer”).
8. else
9.     return (“Yes: The number is a composite integer”).


Since a yes-biased or no-biased Monte Carlo algorithm may give an erroneous
output, the following definition is very important for any Monte Carlo algorithm.

Definition 10.10.2 For a decision problem, a yes-biased Monte Carlo algorithm
has an error probability $\epsilon$ if, for any instance in which the answer is “yes”,
the algorithm gives the incorrect answer “no” with probability at most $\epsilon$, where
the probability is taken over all random choices made by the algorithm when it is
run with a given input.

One of the very important decision problems, known as “Is-Composite”, is
described in Table 10.5.

As the answer to the problem “Is-Composite” is either “yes” or “no”, the problem
is a decision problem. In the next section, we provide some efficient
yes-biased Monte Carlo algorithms for this problem.


10.10.1 Solovay-Strassen Primality Testing 

The primality test of Solovay and Strassen is a randomized algorithm based on the
theory of quadratic reciprocity. This test recognizes a composite number
with probability at least $\frac{1}{2}$. The algorithm is described in Table 10.6.
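The method of Table 10.6 can be sketched in Python as follows. The Jacobi symbol is computed with the standard binary algorithm based on quadratic reciprocity. The names `jacobi`, `solovay_strassen_round` and `solovay_strassen`, and the split into a fixed-base round and a randomized wrapper, are assumptions made for this illustration.

```python
import random
from math import gcd


def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd integer n >= 1."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:             # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                   # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def solovay_strassen_round(n, a):
    """Steps 2 to 9 of Table 10.6 for a fixed base a; True means 'Yes: composite'."""
    if gcd(a, n) != 1:
        return True
    x = jacobi(a, n) % n              # represent -1 as n - 1
    y = pow(a, (n - 1) // 2, n)
    return x != y


def solovay_strassen(n, rng=random):
    """One run of the test on an odd integer n >= 3 with a random base."""
    return solovay_strassen_round(n, rng.randrange(1, n))


print(solovay_strassen_round(15, 2))  # base 2 exposes 15 as composite
print(solovay_strassen(97))           # a prime is never reported composite
```

Separating the fixed-base round from the random choice makes the behaviour reproducible when a particular base is to be examined.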





First note that, as the gcd algorithm, the square and multiply algorithm and the
algorithm to compute the Jacobi symbol can each be done in time $O((\log n)^3)$, the
Solovay-Strassen primality testing algorithm runs in time $O((\log n)^3)$. Now we
shall show that the Solovay-Strassen primality testing algorithm is a yes-biased Monte
Carlo algorithm with error probability at most $\frac{1}{2}$. To prove this we shall use the
following proposition and lemmas.

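Proposition 10.10.1 Let $n$ be an odd integer greater than 1. Let
$SS(n) = \{(a) \in \mathbb{Z}_n^* \mid \left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \pmod{n}\}$.
Then $SS(n)$ is a subgroup of $\mathbb{Z}_n^*$.

Proof As $(1) \in SS(n)$, $SS(n) \neq \emptyset$. Let $a, b \in SS(n)$. Then
$\left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \pmod{n}$ and $\left(\frac{b}{n}\right) \equiv b^{(n-1)/2} \pmod{n}$. Now
$\left(\frac{ab}{n}\right) = \left(\frac{a}{n}\right)\left(\frac{b}{n}\right) \equiv a^{(n-1)/2} b^{(n-1)/2} \equiv (ab)^{(n-1)/2} \pmod{n}$.
Thus $ab \in SS(n)$. So $SS(n)$ is closed under multiplication modulo $n$. As $SS(n)$ is a
finite subset of $\mathbb{Z}_n^*$, $SS(n)$ is a subgroup of $\mathbb{Z}_n^*$. □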

Our next aim is to show that SS(n) is a proper subgroup of Z*. To prove that we 
shall take the help of the following lemmas. 

Lemma 10.10.1 Let $n = p^k m$ be an odd integer greater than 1 with $p$ an odd
prime, $\gcd(p, m) = 1$ and $k \geq 2$. Then there exists an element $(g) \in \mathbb{Z}_n^*$ such that
$\left(\frac{g}{n}\right) \not\equiv g^{(n-1)/2} \pmod{n}$.


Proof As $\varphi(n) = p^{k-1}(p - 1)\varphi(m)$, $p$ divides $\varphi(n)$, which is the order of the group
$\mathbb{Z}_n^*$. Thus by Cauchy’s Theorem, there exists an element, say $(g) \in \mathbb{Z}_n^*$, such that
the order of $(g)$ is $p$. We claim that $\left(\frac{g}{n}\right) \not\equiv g^{(n-1)/2} \pmod{n}$. If possible, let
$\left(\frac{g}{n}\right) \equiv g^{(n-1)/2} \pmod{n}$. As $(g) \in \mathbb{Z}_n^*$, $\gcd(g, n) = 1$. Now the value of
$\left(\frac{g}{n}\right)$ is either $+1$ or $-1$. Thus $g^{(n-1)/2} \equiv \pm 1 \pmod{n}$, which implies
$g^{n-1} \equiv 1 \pmod{n}$. Hence the order of $(g)$ divides $n - 1$, i.e., $p$ divides $n - 1$.
Again, as $n = p^k m$, $p$ divides $n$, a contradiction.
Thus $\left(\frac{g}{n}\right) \not\equiv g^{(n-1)/2} \pmod{n}$. This completes the proof. □

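Lemma 10.10.2 Let $n = p_1 p_2 \cdots p_k$, where the $p_i$’s are distinct odd primes. Suppose
$a \equiv u \pmod{p_1}$ and $a \equiv 1 \pmod{p_2 p_3 \cdots p_k}$, where $u$ is a quadratic non-residue
modulo $p_1$. Then $\left(\frac{a}{n}\right) \not\equiv a^{(n-1)/2} \pmod{n}$.

Proof Note that
$\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right)\left(\frac{a}{p_2 \cdots p_k}\right) = \left(\frac{u}{p_1}\right)\left(\frac{1}{p_2 \cdots p_k}\right) = -1$.
We claim that $a^{(n-1)/2} \not\equiv \left(\frac{a}{n}\right) \pmod{n}$. If possible, let
$a^{(n-1)/2} \equiv \left(\frac{a}{n}\right) \pmod{n}$. Then $a^{(n-1)/2} \equiv -1 \pmod{n}$ and hence
$a^{(n-1)/2} \equiv -1 \pmod{p_2 \cdots p_k}$, contradicting the fact that
$a^{(n-1)/2} \equiv 1 \pmod{p_2 \cdots p_k}$. Thus $a^{(n-1)/2} \not\equiv \left(\frac{a}{n}\right) \pmod{n}$.
This completes the proof. □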

We are now in a position to prove the following theorem. 

Theorem 10.10.1 The Solovay-Strassen primality testing algorithm is a yes-biased
Monte Carlo algorithm with error probability at most $\frac{1}{2}$.

Proof Clearly, the Solovay-Strassen primality testing algorithm is a yes-biased
Monte Carlo algorithm. Note that an error may occur only when $n$ is a
composite integer but the algorithm returns “no”, i.e., when $a \in SS(n)$. So we need
to find a bound on the size of $SS(n)$. To get that we first show that $SS(n)$ is a proper
subgroup of $\mathbb{Z}_n^*$. As $n$ is an odd composite integer, either $n = p^k m$
with $p$ an odd prime, $\gcd(p, m) = 1$ and $k \geq 2$, or $n = p_1 p_2 \cdots p_k$, where the $p_i$’s are
distinct odd primes. In both cases, from Lemmas 10.10.1 and 10.10.2, we see
that there exists an element, say $(g)$, such that $(g) \in \mathbb{Z}_n^*$ but $(g) \notin SS(n)$, so
that $SS(n)$ is a proper subgroup of $\mathbb{Z}_n^*$. As the order of $SS(n)$ divides the order of
$\mathbb{Z}_n^*$, let $\frac{|\mathbb{Z}_n^*|}{|SS(n)|} = t$. Then $t \geq 2$. Thus
$|SS(n)| = \frac{|\mathbb{Z}_n^*|}{t} \leq \frac{\varphi(n)}{2} \leq \frac{n-1}{2}$. Consequently, the error
probability of the algorithm is at most $\frac{n-1}{2(n-1)} = \frac{1}{2}$. □


10.10.2 Pseudo-primes and Primality Testing Based on Fermat's 
Little Theorem 


Fermat’s Little Theorem 10.4.2 ensures that if $p$ is a prime and $a$ is any integer
such that $1 \leq a < p$, then $a^{p-1} \equiv 1 \pmod{p}$. Consequently, for a given $n$, if we
can find an integer $b$ such that $1 \leq b < n$ and $b^{n-1} \not\equiv 1 \pmod{n}$, then we know
that $n$ must be composite. We call such a number $b$, $1 \leq b < n$, an F-witness for
$n$. So, if $n$ has an F-witness, $n$ must be a composite number, i.e., the F-witness
$b$ for $n$ provides a certificate for the compositeness of $n$. It can be shown that 2
is an F-witness for all composite numbers not exceeding 340. However,
$2^{340} \equiv 1 \pmod{341}$, even though $341 = 11 \cdot 31$ is a composite number. So 2 is not an
F-witness for 341. We call 2 an F-liar for 341. In general, for an odd composite
number $n$, an integer $a$, $1 \leq a < n$, is called an F-liar if $a^{n-1} \equiv 1 \pmod{n}$. Thus 2
is an F-liar for 341 while 3 is an F-witness for 341. So we see that there exist
composite integers which pass Fermat’s test for some base. This observation leads
to the definition of a special type of numbers, known as pseudo-primes. Let us first
define a pseudo-prime to a given base.

Definition 10.10.3 Let $n$ be a positive composite integer and $a$ be a positive integer.
Then the integer $n$ is said to be a pseudo-prime to the base $a$ if $a^n \equiv a \pmod{n}$.

Remark 1 Thus a positive odd composite integer $n$ is a pseudo-prime to a base
$a$ if $a$ is an F-liar for $n$.

Remark 2 If $\gcd(a, n) = 1$, then the congruence $a^n \equiv a \pmod{n}$ is equivalent to
the congruence $a^{n-1} \equiv 1 \pmod{n}$.

Example 10.10.1 The integers $341 = 11 \times 31$, $561 = 3 \times 11 \times 17$ and $645 = 3 \times 5 \times 43$
are examples of pseudo-primes to the base 2.
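Definition 10.10.3 translates directly into a short search. The following sketch, in which the function names `is_composite` and `is_pseudoprime` are assumptions of this example, lists the pseudo-primes to the base 2 below 1000 and recovers the three integers of Example 10.10.1.

```python
from math import isqrt


def is_composite(n):
    """True if n has a divisor d with 1 < d < n (trial division)."""
    return any(n % d == 0 for d in range(2, isqrt(n) + 1))


def is_pseudoprime(n, a):
    """Definition 10.10.3: n is composite and a^n = a (mod n)."""
    return is_composite(n) and pow(a, n, n) == a % n


base2 = [n for n in range(2, 1000) if is_pseudoprime(n, 2)]
print(base2)
```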
It has been found that pseudo-primes to a fixed base $a$ are much rarer than
prime numbers. In particular, in the interval $(2, 10^{10})$, there are 455052512 primes
while there are only 14884 pseudo-primes to the base 2. Although pseudo-primes
to any given base are rare, there are, nevertheless, infinitely many pseudo-primes to
any given base. Here we shall prove this for the base 2 only. The following lemmas
are useful to prove the result.

Lemma 10.10.3 If $k$ and $n$ are positive integers such that $k$ divides $n$, then $2^k - 1$
divides $2^n - 1$.

Proof Let $n = kq$ for some positive integer $q$. Then
$2^n - 1 = 2^{kq} - 1 = (2^k - 1)\left((2^k)^{q-1} + (2^k)^{q-2} + \cdots + 2^k + 1\right)$,
which implies that $2^k - 1$ divides $2^n - 1$. □

Lemma 10.10.4 If an odd positive integer $n$ is a pseudo-prime to the base 2, then
$2^n - 1$ is also a pseudo-prime to the base 2.

Proof Let $N = 2^n - 1$. Since $n$ is composite, it has a divisor $d$ with $1 < d < n$, and
Lemma 10.10.3 shows that $2^d - 1$, which satisfies $1 < 2^d - 1 < N$, divides $N$. Hence
$N$ is a composite integer. As $\gcd(2, N) = 1$, to show that $N$ is a pseudo-prime, we only
need to show that $2^{N-1} \equiv 1 \pmod{N}$. Since by hypothesis $n$ is an odd pseudo-prime to the
base 2, $2^{n-1} \equiv 1 \pmod{n}$. Thus there exists some integer $k$ such that $2^{n-1} - 1 = kn$. Now
$2^{N-1} = 2^{2^n - 2} = 2^{2(2^{n-1} - 1)} = 2^{2kn}$. As $n$ divides $2kn$, by Lemma 10.10.3,
$N = 2^n - 1$ divides $2^{2kn} - 1 = 2^{N-1} - 1$, with the result that $2^{N-1} \equiv 1 \pmod{N}$.
Hence $N = 2^n - 1$ is a pseudo-prime to the base 2. □

Now we are in a position to prove the existence of infinitely many pseudo-primes 
to the base 2. 

Theorem 10.10.2 There are infinitely many pseudo-primes to the base 2. 

Proof Note that 341 is an odd pseudo-prime to the base 2. Now by Lemma 10.10.4,
we are able to construct infinitely many odd pseudo-primes to the base 2 by
taking $ps_1 = 341$, $ps_2 = 2^{ps_1} - 1$, $ps_3 = 2^{ps_2} - 1$, and so on. Note that these odd
integers are all distinct since $ps_1 < ps_2 < ps_3 < \cdots$. Therefore, the infinite sequence
$ps_1, ps_2, \ldots$ shows the existence of infinitely many pseudo-primes to the base 2. □
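The first step of this construction is small enough to verify by machine. The sketch below checks, for $ps_1 = 341$, that $ps_2 = 2^{341} - 1$ is composite and satisfies the base-2 congruence. The variable names are assumptions of this example; the next term $ps_3 = 2^{ps_2} - 1$ is far too large to write down, so only one step is checked.

```python
ps1 = 341
ps2 = 2 ** ps1 - 1                 # a 341-bit integer

# 11 divides 341, so 2^11 - 1 divides 2^341 - 1 (Lemma 10.10.3): ps2 is composite.
has_proper_divisor = ps2 % (2 ** 11 - 1) == 0

# Lemma 10.10.4: 2^(ps2 - 1) = 1 (mod ps2).
passes_base2 = pow(2, ps2 - 1, ps2) == 1

print(has_proper_divisor, passes_base2)
```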

Thus we see that while testing the primality of an integer $n$, if we find that
$2^{n-1} \equiv 1 \pmod{n}$, then we know that $n$ is either prime or a pseudo-prime to the base 2.
One follow-up approach is to test whether $a^{n-1} \equiv 1 \pmod{n}$ for
various positive integers $a$. If we find any value $b$, $1 < b < n$, with $\gcd(b, n) = 1$ and
$b^{n-1} \not\equiv 1 \pmod{n}$, then we can readily say that $n$ is composite. For example, we
have seen that 341 is a pseudo-prime to the base 2. But $3^{340} \equiv 56 \not\equiv 1 \pmod{341}$,
with the result that 341 is a composite integer.

Now the question that comes to our mind is: “exploiting the above fact, can we 
develop a yes-biased Monte Carlo algorithm for the problem Is-Composite?” 

For that, let us first prove the following useful result, which is a kind of converse
of Fermat’s Little Theorem.




Table 10.7 Fermat’s test

Input: Odd integer $n > 3$.
Method:
1. Choose an integer $a$ randomly from $\{2, 3, \ldots, n - 2\}$.
2. if ($a^{n-1} \not\equiv 1 \pmod{n}$)
3.     return (“Yes: The number is a composite integer”).
4. else
5.     return (“No: The number is a prime integer”).


Theorem 10.10.3 For an integer $n > 2$, if $a^{n-1} \equiv 1 \pmod{n}$ for all $a$, $1 \leq a < n$,
then $n$ must be a prime number.

Proof As $n > 2$, for all $a$, $1 \leq a < n$, $a \cdot a^{n-2} \equiv 1 \pmod{n}$ implies that $(a)$
has an inverse in the ring $\mathbb{Z}_n$, i.e., $(a) \in \mathbb{Z}_n^*$. Thus $a$ and $n$ are relatively prime for all
$a$, $1 \leq a < n$. Hence $n$ must be a prime integer. □

Theorem 10.10.3 ensures that if $n$ is a composite integer, then there must exist
at least one F-witness for $n$. Further note that as $1^{n-1} \equiv 1 \pmod{n}$ and
$(n - 1)^{n-1} \equiv (-1)^{n-1} \equiv 1 \pmod{n}$ for every odd integer $n \geq 3$, 1 and $n - 1$ are
always F-liars for $n$, called the trivial F-liars for $n$. The above observation leads to
the randomized primality testing algorithm known as Fermat’s test, as described in
Table 10.7.

Now we shall show that the algorithm is a yes-biased Monte Carlo algorithm for
$n$, provided there is at least one F-witness $a$ for $n$ such that $\gcd(a, n) = 1$. To prove
that, let us first prove the following lemma.

Lemma 10.10.5 For an odd composite integer $n > 3$, if there exists at least one
F-witness $c$ for $n$ such that $\gcd(c, n) = 1$, then the set of all F-liars for $n$, i.e.,
$L_n^F = \{(a) \in \mathbb{Z}_n : 1 \leq a < n \text{ and } a^{n-1} \equiv 1 \pmod{n}\}$, is a proper subgroup of $\mathbb{Z}_n^*$.

Proof Clearly, $(1) \in L_n^F$ and hence $L_n^F$ is a non-empty subset of $\mathbb{Z}_n^*$. Now let
$(a), (b) \in L_n^F$. Then $a^{n-1} \equiv 1 \pmod{n}$ and $b^{n-1} \equiv 1 \pmod{n}$. Thus
$(ab)^{n-1} = a^{n-1} b^{n-1} \equiv 1 \pmod{n}$, resulting in $(ab) \in L_n^F$. Hence, from the property of
finite groups, $L_n^F$ is a subgroup of $\mathbb{Z}_n^*$. Further, as there exists at least one F-witness, say
$c$, for $n$ such that $(c) \in \mathbb{Z}_n^*$ and $c^{n-1} \not\equiv 1 \pmod{n}$, it follows that $(c) \notin L_n^F$. Hence
$L_n^F \subsetneq \mathbb{Z}_n^*$. □

Now we shall prove the following theorem. 

Theorem 10.10.4 If $n > 3$ is an odd composite integer such that there exists at
least one F-witness $a$ for $n$ such that $(a) \in \mathbb{Z}_n^*$, then the algorithm as described in
Table 10.7 is a yes-biased Monte Carlo algorithm with error probability at most $\frac{1}{2}$.

Proof Clearly the algorithm as described in Table 10.7 is a yes-biased Monte Carlo
algorithm. Let $L_n^F$ denote the set of all F-liars for $n$, as in Lemma 10.10.5. The error
occurs when $(a) \in L_n^F \setminus \{(1), (n - 1)\}$. As $L_n^F \subsetneq \mathbb{Z}_n^*$,
$|L_n^F|$ is a proper divisor of $|\mathbb{Z}_n^*| = \varphi(n) < n - 1$. Thus
$|L_n^F| \leq \frac{\varphi(n)}{2} < \frac{n-1}{2}$. As a result, the
probability that an $a$ randomly chosen from $\{2, 3, \ldots, n - 2\}$ is in $L_n^F \setminus \{(1), (n - 1)\}$
is at most

$\frac{|L_n^F| - 2}{n - 3} < \frac{\frac{n-1}{2} - 2}{n - 3} = \frac{n - 5}{2(n - 3)} < \frac{1}{2}$. □


Table 10.8 Repeated Fermat’s test

Input: Odd integer $n > 3$ and an integer $k \geq 1$.
Method:
1. repeat $k$ times
2.     Choose an integer $a$ randomly from $\{2, 3, \ldots, n - 2\}$
3.     if ($a^{n-1} \not\equiv 1 \pmod{n}$)
4.         return (“Yes: The number is a composite integer”).
5. return (“No: The number is a prime integer”).



Remark To increase the confidence level of the primality testing as described in 
Table 10.7, we may repeat the process as described in Table 10.8. 


Remark Note that if the algorithm as described in Table 10.8 outputs “Yes: The
number is a composite integer”, then the algorithm has found an F-witness for $n$ and
hence $n$ must be a composite integer. On the other hand, if $n$ is composite and there
exists at least one F-witness for $n$ in $\mathbb{Z}_n^*$, then the probability of an error in all $k$
attempts is at most $\left(\frac{1}{2}\right)^k$. Thus by choosing $k$ large enough, the error probability
can be made as small as desired.
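The repeated test of Table 10.8 can be sketched in Python as follows. The names `is_f_witness` and `repeated_fermat_test`, and the optional `rng` parameter (any object providing `randrange`), are assumptions of this illustration. A return value of `True` stands for “Yes: composite” and `False` for “No: prime”.

```python
import random


def is_f_witness(a, n):
    """True if a^(n-1) != 1 (mod n), i.e. a certifies that n is composite."""
    return pow(a, n - 1, n) != 1


def repeated_fermat_test(n, k, rng=random):
    """Table 10.8 for an odd integer n > 3, using k random bases."""
    for _ in range(k):
        a = rng.randrange(2, n - 1)    # uniform on {2, ..., n - 2}
        if is_f_witness(a, n):
            return True                # an F-witness was found
    return False


print(is_f_witness(2, 341), is_f_witness(3, 341))
print(repeated_fermat_test(101, 20))
```

For a prime input the answer is always `False`, since no F-witness exists; for a composite input the answer depends on the random bases drawn.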

Unfortunately, there are composite integers that cannot be shown to be composite
using the above approach, because there are integers which are pseudo-primes to
every relatively prime base, that is, there are composite integers $n$ such that
$b^{n-1} \equiv 1 \pmod{n}$ for all $b$ relatively prime to $n$. This leads to the following important
definition.

Definition 10.10.4 A composite integer $n$ which satisfies $a^{n-1} \equiv 1 \pmod{n}$ for all
positive integers $a$ with $\gcd(a, n) = 1$ is called a Carmichael number, named after R.
Carmichael, who studied such numbers in the early part of the 20th century.

Example 10.10.2 The integer n = 561 is the smallest Carmichael number. 

In 1912, Carmichael conjectured that there exist infinitely many Carmichael
numbers. After 80 years, Alford, Granville and Pomerance showed the correctness
of the conjecture. As the proof of the conjecture is beyond the scope of this book, we
do not discuss it here. However, the following theorems provide some useful
properties of Carmichael numbers.


Theorem 10.10.5 Let $n = p_1 p_2 \cdots p_k$, $k \geq 2$, where the $p_i$’s are distinct primes such that
$p_i - 1$ divides $n - 1$ for all $i \in \{1, 2, \ldots, k\}$. Then $n$ is a Carmichael number.




Proof Let $a$ be a positive integer such that $\gcd(a, n) = 1$. Then $\gcd(a, p_i) = 1$ for
all $i \in \{1, 2, \ldots, k\}$. Hence by Fermat’s Little Theorem 10.4.2, $a^{p_i - 1} \equiv 1 \pmod{p_i}$
for all $i$, which implies $a^{n-1} \equiv 1 \pmod{p_i}$ for all $i$, as $p_i - 1$ divides $n - 1$ by
hypothesis. The rest of the theorem follows from the Chinese Remainder Theorem. □

Example 10.10.3 Using Theorem 10.10.5, we can conclude that n = 6601 = 7 • 23 • 
41 is a Carmichael number. 
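The divisibility condition of Theorem 10.10.5 is easy to check by machine and can be compared with a brute-force verification of the defining congruence. In this sketch the function names `satisfies_theorem_10_10_5` and `fermat_holds_for_all_units` are assumptions of the example, and the caller is assumed to pass a list of primes.

```python
from math import gcd, prod


def satisfies_theorem_10_10_5(primes):
    """Hypothesis of Theorem 10.10.5 for n = product of the given primes."""
    n = prod(primes)
    distinct = len(set(primes)) == len(primes)
    return len(primes) >= 2 and distinct and all((n - 1) % (p - 1) == 0 for p in primes)


def fermat_holds_for_all_units(n):
    """Brute force: a^(n-1) = 1 (mod n) for every a coprime to n."""
    return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


for primes in ([7, 23, 41], [3, 11, 17], [11, 31]):
    n = prod(primes)
    print(n, satisfies_theorem_10_10_5(primes), fermat_holds_for_all_units(n))
```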

Now we shall show that the converse of Theorem 10.10.5 is also true. To
prove that we need some concepts from group theory. For an abelian group $(G, \cdot)$,
we say that an integer $k$ kills the group $G$ iff $G^k = \{e\}$, where $e$ is the identity
element of $G$. Let $K_G = \{k \in \mathbb{Z} \mid k \text{ kills } G\}$. Then clearly $K_G$ is a subgroup of the
group $(\mathbb{Z}, +)$ and hence of the form $n\mathbb{Z}$ for a uniquely determined non-negative
integer $n$. This integer $n$ is called the exponent of $G$. Note that if $n \neq 0$, then this $n$ is
the least positive integer that kills $G$. Further note that for any integer $k$ such that
$G^k = \{e\}$, the exponent $n$ of $G$ divides $k$. Also, if $G$ is a finite cyclic group, then the
exponent of $G$ is the same as the order of the group $G$. Using the concept of the exponent
of a group, we prove the following theorem.

Theorem 10.10.6 The converse of Theorem 10.10.5 is also true, i.e., every Carmichael
number $n$ is a product $p_1 p_2 \cdots p_k$ of distinct primes such that $p_i - 1$ divides $n - 1$
for all $i$.

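Proof Let $n$ be a Carmichael number. Then $n$ is a composite integer and $a^{n-1} \equiv 1
\pmod{n}$ for all $(a) \in \mathbb{Z}_n^*$. Hence $n - 1$ kills the group $\mathbb{Z}_n^*$. Note that $n$ is odd: as
$\gcd(n - 1, n) = 1$, we have $(-1)^{n-1} \equiv (n - 1)^{n-1} \equiv 1 \pmod{n}$, which for $n > 2$ forces
$n - 1$ to be even. Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$
be the prime factorization of $n$, where the $p_i$’s are distinct odd primes and $\alpha_i \geq 1$ for all
$i = 1, 2, \ldots, k$. Now by the Chinese Remainder Theorem, $\mathbb{Z}_n^*$ is isomorphic to
the group $\mathbb{Z}_{p_1^{\alpha_1}}^* \times \mathbb{Z}_{p_2^{\alpha_2}}^* \times \cdots \times \mathbb{Z}_{p_k^{\alpha_k}}^*$, where each
$\mathbb{Z}_{p_i^{\alpha_i}}^*$ is a cyclic group of
order $p_i^{\alpha_i - 1}(p_i - 1)$. Thus $n - 1$ kills the group $\mathbb{Z}_n^*$ iff $n - 1$ kills each of the groups
$\mathbb{Z}_{p_i^{\alpha_i}}^*$, i.e., iff $p_i^{\alpha_i - 1}(p_i - 1)$ divides $n - 1$ for all $i = 1, 2, \ldots, k$. Now we claim that
$\alpha_i = 1$ for all $i = 1, 2, \ldots, k$. If not, then there exists $p_i$, for some $i$, such that $p_i$
divides both $n - 1$ and $n$, a contradiction. Thus $\alpha_i = 1$ for all $i = 1, 2, \ldots, k$, and
hence $n = p_1 p_2 \cdots p_k$ and $p_i - 1$ divides $n - 1$ for all $i = 1, 2, \ldots, k$. □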
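The notion of the exponent of a group used in the proof above can be explored numerically for $\mathbb{Z}_n^*$. The brute-force sketch below computes the exponent as the least common multiple of the orders of all units; the function name `exponent_of_units` is an assumption of this example, and the method is only practical for small $n$.

```python
from math import gcd


def exponent_of_units(n):
    """Least positive k with a^k = 1 (mod n) for every unit a modulo n (n >= 2)."""
    exponent = 1
    for a in range(1, n):
        if gcd(a, n) != 1:
            continue
        order, x = 1, a
        while x != 1:                  # multiplicative order of a modulo n
            x = x * a % n
            order += 1
        exponent = exponent * order // gcd(exponent, order)   # lcm
    return exponent


for n in (7, 341, 561):
    e = exponent_of_units(n)
    print(n, e, (n - 1) % e == 0)      # does n - 1 kill the unit group?
```

For the Carmichael number 561 the exponent divides $n - 1 = 560$, whereas for the pseudo-prime 341 it does not.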

The following theorem also provides useful information about Carmichael num- 
bers. 

Theorem 10.10.7 A Carmichael number must have at least three distinct odd prime
factors.

Proof From Theorem 10.10.6, it follows that if an integer $n > 2$ is a Carmichael
number, then $n = p_1 p_2 \cdots p_k$, where the $p_i$’s are distinct odd primes such that $p_i - 1$
divides $n - 1$ for all $i \in \{1, 2, \ldots, k\}$. We need to show that $k \geq 3$. As $n$ is a positive
composite integer, $k > 1$. If possible, let $k = 2$, i.e., let $n = pq$, where $p$ and $q$ are
distinct primes. Without loss of generality, let us assume that $p > q$. Now
$p - 1 \mid n - 1 = pq - 1 = q(p - 1) + (q - 1)$. Thus $p - 1 \mid q - 1$, a contradiction, since
$0 < q - 1 < p - 1$. So $n$ must be a product of at least three distinct odd primes. □

In the next section we introduce another probabilistic primality test,
based on finding a non-trivial square root of 1 modulo $n$.


10.10.3 Miller-Rabin Primality Testing 

If $n$ is a Carmichael number, then the probability that Fermat’s test for
“Is-Composite” returns a wrong answer is
$\frac{\varphi(n) - 2}{n - 3} \approx \frac{\varphi(n)}{n} = \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$, which is very
close to 1 if $n$ has only a few, large prime factors. As a result, the repetition trick
for reducing the error probability is not effective. So we need to search for other
techniques.
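The size of this error probability can be computed by brute force for small Carmichael numbers. In the sketch below the function name `fermat_liar_count` is an assumption of this example; it counts the bases in $\{2, \ldots, n - 2\}$ for which Fermat’s test gives the wrong answer.

```python
def fermat_liar_count(n):
    """Return (number of F-liars in {2, ..., n-2}, size n - 3 of that range)."""
    liars = sum(1 for a in range(2, n - 1) if pow(a, n - 1, n) == 1)
    return liars, n - 3


for n in (561, 6601):
    liars, total = fermat_liar_count(n)
    print(n, liars, total, round(liars / total, 3))
```

For $n = 561$ there are $\varphi(561) - 2 = 318$ such bases out of 558, so more than half of all choices give a wrong answer.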

The next primality testing algorithm is based on finding a non-trivial square root of
1 modulo $n$. First let us explain what we mean by a square root of 1 modulo
$n$. An integer $a$, $1 \leq a < n$, is called a square root of 1 modulo $n$ if $a^2 \equiv 1 \pmod{n}$.
Note that 1 and $n - 1$ are always square roots of 1 modulo $n$. These numbers
are called the trivial square roots of 1 modulo $n$. Recall that the Corollary of
Theorem 10.5.1 ensures that if $p$ is an odd prime number then 1 has no non-trivial square
root modulo $p$. Thus to develop a primality testing algorithm, we may use the fact:
“If there exists a non-trivial square root of 1 modulo $n$, then $n$ must be a composite
number.” Before developing a yes-biased Monte Carlo algorithm for the problem
“Is-Composite”, let us look at Fermat’s primality test a little more closely. As
we are interested in odd $n$, let us assume that $n = 2^h \cdot t + 1$, where $\gcd(2, t) = 1$ and
$h \geq 1$. Then $a^{n-1} \equiv (a^t)^{2^h} \pmod{n}$, so $a^{n-1} \pmod{n}$ may be
calculated in $h + 1$ intermediate steps if we let $b_0 = a^t \pmod{n}$ and $b_i = b_{i-1}^2 \pmod{n}$
for $i = 1, 2, \ldots, h$. Thus $b_h = a^{n-1} \pmod{n}$. Now let us explain through an
example how this observation may be helpful in developing a probabilistic primality
test.

Let $n = 325$. Then $n - 1 = 81 \cdot 2^2$. In Table 10.9, we calculate the powers
$b_0 = a^{81} \pmod{n}$, $b_1 = (a^{81})^2 = a^{162} \pmod{n}$, $b_2 = a^{324} \pmod{n}$ for a list of different
values of $a$, $2 \leq a < n - 1$, and show how this table plays an important role in
primality testing.

Note that 2 is an F-witness for 325 with $\gcd(2, 325) = 1$, while 130 is also an
F-witness with $\gcd(130, 325) = 65 > 1$. Note that 7, 18, 32, 118, 126, 199, 224 and
251 are all F-liars for 325. So these numbers do not help us in primality testing
using Fermat’s test. But if we start the calculation with 118, 224 and 251, we see
that 274, 274 and 51, respectively, are non-trivial square roots of 1 modulo 325. This
observation directly implies that 325 is a composite number. On the other hand, the
calculations with 7, 18, 32 and 199 provide no information regarding primality testing based on
non-trivial square roots of 1, as $-1 \equiv 324 \pmod{325}$ is a trivial square root of 1 modulo
325 and $7^{162} \equiv -1 \pmod{325}$, $18^{162} \equiv -1 \pmod{325}$, $32^{162} \equiv -1 \pmod{325}$
and $199^{81} \equiv -1 \pmod{325}$. Likewise, the calculation with 126 does not provide any
information on primality testing, as $126^{81} \equiv 1 \pmod{325}$ is also a trivial square root
of 1 modulo 325. So depending on the value of $a$, we get useful information from
the sequence of $b_i$’s. Thus we need to know the general form of the sequence of $b_i$’s,
$i = 0, 1, 2, \ldots, h$, which is described in Table 10.10.


Table 10.9 Calculation of $a^{324} \pmod{325}$ for different base values of $a$

a      b_0 = a^81 (mod 325)    b_1 = a^162 (mod 325)    b_2 = a^324 (mod 325)
2      252                     129                      66
7      307                     324                      1
18     18                      324                      1
32     57                      324                      1
118    118                     274                      1
126    1                       1                        1
130    0                       0                        0
199    324                     1                        1
224    274                     1                        1
251    51                      1                        1


Table 10.10 General form of the sequence of $b_i$’s, where * denotes an arbitrary element from
$\{2, 3, \ldots, n - 2\}$

SN    b_0   b_1   ...                  ...   b_{h-1}   b_h   Information
1     1     1     ...   1    1    1    ...   1         1     No information
2     -1    1     ...   1    1    1    ...   1         1     No information
3     *     1     ...   1    1    1    ...   1         1     n is composite
4     *     -1    ...   1    1    1    ...   1         1     No information
5     *     *     ...   1    1    1    ...   1         1     n is composite
6     *     *     ...   -1   1    1    ...   1         1     No information
7     *     *     ...   *    1    1    ...   1         1     n is composite
8     *     *     ...   *    -1   1    ...   1         1     No information
9     *     *     ...   *    *    1    ...   1         1     n is composite
10    *     *     ...   *    *    -1   ...   1         1     No information
11    *     *     ...   *    *    *    ...   1         1     n is composite
12    *     *     ...   *    *    *    ...   -1        1     No information
13    *     *     ...   *    *    *    ...   *         1     n is composite
14    *     *     ...   *    *    *    ...   *         -1    n is composite
15    *     *     ...   *    *    *    ...   *         *     n is composite

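The rows of Table 10.9 can be reproduced with a few lines of Python. The function name `b_sequence` is an assumption of this example; it writes $n - 1 = 2^h t$ with $t$ odd and returns the list $[b_0, b_1, \ldots, b_h]$ for a given base $a$.

```python
def b_sequence(a, n):
    """Return [b_0, ..., b_h] with b_0 = a^t mod n and b_i = b_(i-1)^2 mod n."""
    h, t = 0, n - 1
    while t % 2 == 0:                  # write n - 1 = 2^h * t with t odd
        t //= 2
        h += 1
    b = pow(a, t, n)
    sequence = [b]
    for _ in range(h):
        b = b * b % n
        sequence.append(b)
    return sequence


for a in (2, 7, 18, 32, 118, 126, 130, 199, 224, 251):
    print(a, b_sequence(a, 325))
```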

As Row 1, Row 2, Row 4, Row 6, Row 8, Row 10, and Row 12 of Table 10.10
do not ensure the existence of any non-trivial square root of 1 modulo $n$, we get
no information regarding the primality testing. On the other hand, Row 3, Row 5,
Row 7, Row 9, Row 11 and Row 13 ensure the existence of a non-trivial square root
of 1 modulo $n$, thereby providing the information that $n$ is a composite integer. For
Row 14 and Row 15, as $b_h \not\equiv 1 \pmod{n}$, the corresponding $a$ is an F-witness for $n$.
The above observation leads to the following definition.

Definition 10.10.5 Let $n = 2^h \cdot t + 1 > 3$ be an odd integer, where $\gcd(2, t) = 1$
and $h \geq 1$. An integer $a$, $1 \leq a < n$, is called an A-witness for $n$ if $a^t \not\equiv 1 \pmod{n}$
and $a^{t 2^i} \not\equiv -1 \pmod{n}$ for all $i$, $0 \leq i < h$. On the other hand, if $n$ is composite and
$a$ is not an A-witness for $n$, that is, either $a^t \equiv 1 \pmod{n}$ or $a^{t 2^i} \equiv -1 \pmod{n}$ for
some $i$, $0 \leq i < h$, then $a$ is said to be an A-liar for $n$.

Example 10.10.4 For $n = 325$, $a = 2, 118, 130, 224, 251$ are examples of A-witnesses,
while $a = 7, 18, 32, 126$ and $199$ are examples of A-liars.

Now we are going to prove the following lemma which will help us in developing 
a probabilistic primality testing. 

Lemma 10.10.6 If $a$ is an A-witness for $n$, then $n$ must be a composite integer.

Proof Let $a$ be an A-witness for $n$. Let us consider the sequence $b_0, b_1, b_2, \ldots, b_h$
as in Table 10.10. Now we may have two cases. First let $b_h \not\equiv 1 \pmod{n}$. Then
$a$ must be an F-witness for $n$, implying that $n$ is a composite integer. Now let
$b_h \equiv 1 \pmod{n}$. As $a$ is an A-witness, $b_0 \not\equiv 1 \pmod{n}$ and $n - 1$ does not occur in the
sequence $b_0, b_1, b_2, \ldots, b_{h-1}$. Let $i$ be the minimum $i \geq 1$ such that $b_i \equiv 1 \pmod{n}$.
Such an $i$ always exists as $b_h \equiv 1 \pmod{n}$. As $b_{i-1} \notin \{1, n - 1\}$, $b_{i-1}$ is a non-trivial
square root of 1 modulo $n$ and thereby $n$ is a composite integer. □

Based on the above lemma, very soon we are going to develop a primality test- 
ing. Before that let us try to find some properties of A -liars. For that the following 
definition plays an important role. 

Definition 10.10.6 A positive integer $n > 3$ of the form $2^h t + 1$, where $\gcd(2, t) = 1$,
passes Miller’s test for a base $a$, $1 \leq a < n$, if either $a^t \equiv 1 \pmod{n}$ or
$a^{2^i t} \equiv -1 \pmod{n}$ for some $i$ with $0 \leq i \leq h - 1$. In other words, $n$ passes Miller’s test
for a base $a$, $1 \leq a < n$, if $a$ is not an A-witness for $n$; for composite $n$ this means
that $a$ is an A-liar for $n$.

Through the following theorems, we try to find some of the properties of those 
odd integers which pass the Miller’s test for some base a. These properties will lead 
to the primality testing algorithm. 

Theorem 10.10.8 If $n$ is an odd prime and $a$ is a positive integer such that $1 \leq a < n$,
then $n$ passes Miller’s test for the base $a$.

Proof Let $n - 1 = 2^h t$ with $t$ odd and let $a_k = a^{(n-1)/2^k}$, $k = 0, 1, \ldots, h$. By Fermat’s
Little Theorem, we have $a_0 = a^{2^h t} = a^{n-1} \equiv 1 \pmod{n}$. Again, as
$a_1^2 = (a^{(n-1)/2})^2 = a^{n-1} = a_0 \equiv 1 \pmod{n}$,
by the Corollary of Theorem 10.5.1, it follows that either $a_1 \equiv 1 \pmod{n}$ or
$a_1 \equiv -1 \pmod{n}$. So, in general, if we have found that $a_0 \equiv a_1 \equiv \cdots \equiv a_k \equiv 1 \pmod{n}$
with $k < h$, then since $a_{k+1}^2 = a_k \equiv 1 \pmod{n}$, we have either $a_{k+1} \equiv 1 \pmod{n}$ or
$a_{k+1} \equiv -1 \pmod{n}$. Thus continuing this process, we find that either $a_k \equiv 1 \pmod{n}$
for $k = 0, 1, 2, \ldots, h$ or $a_k \equiv -1 \pmod{n}$ for some integer $k$. Hence $n$ passes
Miller’s test for the base $a$. □

Theorem 10.10.9 If a composite integer $n$ passes Miller’s test for a base $a$, then $n$
is a pseudo-prime to the base $a$.

Proof As $n$ passes Miller’s test for the base $a$, either $a^t \equiv 1 \pmod{n}$ or
$a^{2^i t} \equiv -1 \pmod{n}$ for some $i$ with $0 \leq i \leq h - 1$, where $n = 2^h t + 1$, $h \geq 1$ and
$\gcd(2, t) = 1$. Thus, taking $i = 0$ in the first case,
$a^n = a^{n-1} a = (a^{2^i t})^{2^{h-i}} a \equiv (\pm 1)^{2^{h-i}} a \equiv a \pmod{n}$, with the
result that $n$ is a pseudo-prime to the base $a$. □

The above theorem leads to the following important definition.

Definition 10.10.7 If $n$ is a composite integer that passes Miller’s test for a base $a$,
then we say that $n$ is a strong pseudo-prime to the base $a$.

Example 10.10.5 Let $n = 2047 = 23 \cdot 89$. Then $n - 1 = 2046 = 2^1 \cdot 1023$. As both
$2^{2046} \equiv 1 \pmod{2047}$ and $2^{1023} \equiv 1 \pmod{2047}$, 2047 is a strong pseudo-prime to
the base 2.

The following theorem shows the existence of infinitely many strong pseudo- 
primes to the base 2. To prove this let us first prove the following lemma. 

Lemma 10.10.7 Let $n$ be an odd integer which is a pseudo-prime to the base 2.
Then $N = 2^n - 1$ is a strong pseudo-prime to the base 2.

Proof As $n$ is a composite integer, Lemma 10.10.3 ensures that $N = 2^n - 1$ is also
a composite integer. Also, as $2^{n-1} \equiv 1 \pmod{n}$, there exists some odd integer $k$
such that $2^{n-1} = kn + 1$. Now $N - 1 = 2^n - 2 = 2(2^{n-1} - 1) = 2nk$, where $nk$ is odd, and
$2^n = N + 1 \equiv 1 \pmod{N}$. Thus $2^{(N-1)/2} = 2^{nk} = (2^n)^k \equiv 1 \pmod{N}$, which implies that $N$
passes Miller’s test for the base 2. Consequently, $N$ is a strong pseudo-prime to the
base 2. □

We are now in a position to prove the existence of infinitely many strong pseudo- 
primes to the base 2. 

Theorem 10.10.10 There are infinitely many strong pseudo-primes to the base 2. 


Proof Lemma 10.10.7 ensures that every odd pseudo-prime $n$ to the base 2 yields a strong pseudo-prime $2^n - 1$ to the base 2. The rest of the theorem follows from Theorem 10.10.2. □
Table 10.11 Miller-Rabin primality test

Input: Odd integer $n > 3$.

Method:

1. write $n - 1 = 2^h t$, where $h \ge 1$ and $\gcd(2, t) = 1$.
2. choose a random integer $a$ such that $2 \le a \le n - 2$.
3. $b \leftarrow a^t \pmod{n}$
4. if $b \equiv 1 \pmod{n}$
5.     return (“No: the number is a prime integer”).
6. for $i \leftarrow 0$ to $h - 1$ do
       if $b \equiv -1 \pmod{n}$
7.         return (“No: the number is a prime integer”).
       $b \leftarrow b^2 \pmod{n}$
8. return (“Yes: the number is a composite integer”).
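The steps of Table 10.11 translate almost line by line into code. The sketch below is one possible Python rendering, not the authors' implementation; the function name `miller_rabin_once` and the optional parameter `a` (which lets the caller fix the base instead of drawing it at random) are our own assumptions.

```python
import random


def miller_rabin_once(n, a=None):
    """One round of the test of Table 10.11 for an odd integer n > 3."""
    if n <= 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 3")
    # Step 1: n - 1 = 2^h * t with t odd.
    h, t = 0, n - 1
    while t % 2 == 0:
        h, t = h + 1, t // 2
    # Step 2: random base with 2 <= a <= n - 2, unless the caller supplies one.
    if a is None:
        a = random.randrange(2, n - 1)
    # Steps 3-5.
    b = pow(a, t, n)
    if b == 1:
        return "No: the number is a prime integer"
    # Steps 6-7.
    for _ in range(h):
        if b == n - 1:
            return "No: the number is a prime integer"
        b = (b * b) % n
    # Step 8.
    return "Yes: the number is a composite integer"


print(miller_rabin_once(2047, a=2))   # the base 2 is fooled by 2047
print(miller_rabin_once(2047, a=3))   # the base 3 exposes 2047 as composite
print(miller_rabin_once(101))         # 101 is prime, so the answer is always "No"
```

In practice the round is repeated with independent random bases; a single "Yes" settles that $n$ is composite, while repeated "No" answers only make primality increasingly likely.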


Remark The smallest odd strong pseudo-prime to the base 2 is 2047. So, if $n > 1$ is an odd integer less than 2047 and $n$ passes Miller's test to the base 2, then $n$ must be a prime integer. Using the concept of strong pseudo-prime, we are now going to present another yes-biased Monte Carlo algorithm for Composites, known as the Miller-Rabin primality test (also known as the “strong pseudo-prime test”). Let us first describe the algorithm in Table 10.11, where $n$ is an odd integer greater than 3.
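The first claim of the Remark can be confirmed by exhaustive search. The following sketch (with hypothetical helper names `passes_miller_test` and `is_prime_trial`) looks for the smallest odd composite integer that passes Miller's test for the base 2.

```python
def passes_miller_test(n, a):
    """Return True if the odd integer n > 1 passes Miller's test for the base a."""
    h, t = 0, n - 1
    while t % 2 == 0:
        h, t = h + 1, t // 2
    b = pow(a, t, n)
    if b == 1:
        return True
    for _ in range(h):
        if b == n - 1:
            return True
        b = (b * b) % n
    return False


def is_prime_trial(n):
    """Deterministic primality check by trial division (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


smallest = next(
    n for n in range(3, 10**4, 2)
    if not is_prime_trial(n) and passes_miller_test(n, 2)
)
print(smallest)
```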

Now we shall show that the algorithm as described in Table 10.11 is a yes-biased Monte Carlo algorithm.

Theorem 10.10.11 The Miller-Rabin primality testing algorithm is a yes-biased 
Monte Carlo algorithm. 

Proof To show that the Miller-Rabin primality testing for an odd integer n > 3 is 
a yes-biased Monte Carlo algorithm, we need to show that if the algorithm returns 
“Yes: the number is a composite integer”, then n must be composite. We shall prove 
the result by the method of contradiction. Suppose, the algorithm returns “Yes: the 
number is a composite integer” for some prime n. Since it returns “Yes: the number 
is a composite integer”, it must be the case that 

$$a^t \not\equiv 1 \pmod{n}. \qquad (10.15)$$

Now we consider the sequence of values $b_0 = a^t$, $b_1 = b_0^2 = a^{2t}$, $b_2 = b_1^2 = a^{2^2 t}$, \ldots, $b_{h-1} = b_{h-2}^2 = a^{2^{h-1} t}$. As the algorithm returns “Yes: the number is a composite integer”, we must have $a^{2^i t} \not\equiv -1 \pmod{n}$ for $0 \le i \le h - 1$. On the other hand, as $n$ is prime and $2 \le a \le n - 2$, by Fermat's Little Theorem 10.4.2, we have $a^{n-1} \equiv 1 \pmod{n}$, i.e., $a^{2^h t} \equiv 1 \pmod{n}$. Thus by the Corollary of Theorem 10.5.1, $a^{2^{h-1} t} \equiv \pm 1 \pmod{n}$. As $a^{2^{h-1} t} \not\equiv -1 \pmod{n}$, we have $a^{2^{h-1} t} \equiv 1 \pmod{n}$. Then $a^{2^{h-2} t}$ must be a square root of 1 modulo $n$. Thus, continuing a similar argument, we have $a^t \equiv 1 \pmod{n}$, contradicting (10.15), proving that the Miller-Rabin primality testing algorithm is a yes-biased Monte Carlo algorithm. □
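The one-sidedness established in Theorem 10.10.11 can be observed experimentally: for a prime $n$, no admissible base makes the procedure answer "Yes". The sketch below checks every base $2 \le a \le n - 2$ for all primes $5 \le n < 200$; the names `says_composite` and `is_prime_trial` are illustrative only.

```python
def says_composite(n, a):
    """True if the procedure of Table 10.11, run with base a, answers 'Yes'."""
    h, t = 0, n - 1
    while t % 2 == 0:
        h, t = h + 1, t // 2
    b = pow(a, t, n)
    if b == 1:
        return False
    for _ in range(h):
        if b == n - 1:
            return False
        b = (b * b) % n
    return True


def is_prime_trial(n):
    """Trial-division primality check, used only as an independent reference."""
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


primes = [p for p in range(5, 200) if is_prime_trial(p)]
false_alarms = [(p, a) for p in primes for a in range(2, p - 1) if says_composite(p, a)]
print(false_alarms)   # expected to be empty by Theorem 10.10.11
```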
To obtain an error bound of the Miller-Rabin algorithm for a given odd integer $n > 3$, we need to consider the set of A-liars for $n$. But unfortunately, unlike the set of all F-liars for $n$, the set of all A-liars for $n$ does not have a good algebraic structure like a group structure. To explain that, let us revisit Table 10.9. Note that 7 and 32 are A-liars for 325 while their product 224 is an A-witness for 325. Thus to obtain an error bound, we consider the set $MR(n)$, for a given odd positive integer $n$, defined by
$$MR(n) = \{a \in \mathbb{Z}_n \setminus \{(0)\} : a^{n-1} = (1) \text{ and for } j = 0, 1, \ldots, h - 1,\ a^{t2^{j+1}} = (1) \text{ implies } a^{t2^j} = \pm(1)\},$$
where $n = 2^h t + 1$, $h \ge 1$ and $\gcd(2, t) = 1$. We shall prove that the set of all A-liars is contained in $MR(n)$. Further we shall prove the main result that if $n$ is a composite integer, then $|MR(n)| \le \frac{n-1}{4}$. To prove the main result, let us first prove the following lemmas.
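The failure of closure under multiplication mentioned above for $n = 325$ is easy to reproduce. In this sketch, `is_liar(n, a)` is a hypothetical helper that returns True exactly when $n$ passes Miller's test for the base $a$.

```python
def is_liar(n, a):
    """True if the odd integer n passes Miller's test for the base a."""
    h, t = 0, n - 1
    while t % 2 == 0:
        h, t = h + 1, t // 2
    b = pow(a, t, n)
    if b == 1:
        return True
    for _ in range(h):
        if b == n - 1:
            return True
        b = (b * b) % n
    return False


n = 325                                   # 325 = 5^2 * 13, n - 1 = 2^2 * 81
print(is_liar(n, 7), is_liar(n, 32))      # both bases are liars
print((7 * 32) % n, is_liar(n, 224))      # their product 224 is a witness
```

Here $7^{162} \equiv 32^{162} \equiv -1 \pmod{325}$, whereas $224^{81} \not\equiv \pm 1$ and $224^{162} \equiv 1 \pmod{325}$.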

Lemma 10.10.8 If $n$ passes Miller's test for a base $a$, i.e., if $a$ is an A-liar for $n$, then $a \in MR(n)$.

Proof The lemma follows from Theorem 10.10.9, Table 10.10 and the Definition of 
MR(n). □ 

Lemma 10.10.9 Let $f : \mathbb{Z}_n^* \to \mathbb{Z}_n^*$ be the group homomorphism defined by $f(x) = x^{n-1}$, $\forall x \in \mathbb{Z}_n^*$. Then $|\ker f| = \gcd(\phi(n), n - 1)$, where $n = p^e$, for some odd prime $p$ with $e \ge 1$.

Proof As $p$ is an odd prime and $e \ge 1$, by Theorem 10.4.6, $\mathbb{Z}_n^*$ is a cyclic group of order $\phi(n)$. So, the group $(\mathbb{Z}_n^*, \cdot)$ is isomorphic to the group $(\mathbb{Z}_{\phi(n)}, +)$. Thus the homomorphism $x \mapsto x^{n-1}$ in $\mathbb{Z}_n^*$ induces a homomorphism $g : \mathbb{Z}_{\phi(n)} \to \mathbb{Z}_{\phi(n)}$ defined by $g(x) = (n - 1)x$, $\forall x \in \mathbb{Z}_{\phi(n)}$. Hence $|\ker f| = |\ker g|$. Now we are going to calculate $|\ker g|$. Note that $\ker g = \{x \in \mathbb{Z}_{\phi(n)} : (n - 1)x \equiv 0 \pmod{\phi(n)}\}$. Let $d = \gcd(\phi(n), n - 1)$. Thus $y \in \ker g$ if and only if $(n - 1)y \equiv 0 \pmod{\phi(n)}$ if and only if $\phi(n)$ divides $(n - 1)y$ if and only if $\frac{\phi(n)}{d}$ divides $\frac{n-1}{d}\, y$ if and only if $\frac{\phi(n)}{d}$ divides $y$ (since $\frac{\phi(n)}{d}$ and $\frac{n-1}{d}$ are relatively prime) if and only if $y \equiv 0 \pmod{\frac{\phi(n)}{d}}$. Thus the kernel of $g$ contains precisely the $d$ residue classes $\left(\frac{i\,\phi(n)}{d}\right)$, where $i = 0, 1, \ldots, d - 1$, with the result $|\ker f| = |\ker g| = \gcd(\phi(n), n - 1)$. □
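Lemma 10.10.9 lends itself to a brute-force sanity check for small prime powers. The sketch below counts the solutions of $x^{n-1} = (1)$ in $\mathbb{Z}_n^*$ directly and compares the count with $\gcd(\phi(n), n - 1)$; the helper names `phi` and `kernel_size` are our own.

```python
from math import gcd


def phi(n):
    """Euler phi-function by direct counting (adequate for small n)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def kernel_size(n):
    """Number of units x modulo n with x^(n-1) = 1."""
    return sum(1 for x in range(1, n) if gcd(x, n) == 1 and pow(x, n - 1, n) == 1)


for n in (3, 9, 25, 27, 49, 125):          # odd prime powers p^e
    print(n, kernel_size(n), gcd(phi(n), n - 1))
```

For $n = p^e$ both columns equal $p - 1$, which is the value used in Case 1 of the proof of Theorem 10.10.14.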

Lemma 10.10.10 Suppose that $a$ is an element of an additive abelian group. Suppose further that for some prime $p$ and an integer $e \ge 1$, $p^e a = (0)$ and $p^{e-1} a \ne (0)$. Then the order of the element $a$ is $p^e$.

Proof Let $m$ be the order of the element $a$. Then $m$ divides $p^e$. As $p$ is a prime integer, $m$ must be of the form $p^i$ for some $i \in \{0, 1, 2, \ldots, e\}$. We claim that $i = e$. If $i < e$, then $ma = p^i a = (0)$, which implies that $p^{e-1} a = (0)$, contradicting the fact that $p^{e-1} a \ne (0)$. Hence $i$ must be equal to $e$, with the result $m = p^e$. □

Now we are in a position to prove the main theorem. But before proving that, let us first discuss some notions of random variable and uniform distribution from probability theory. We introduce these concepts starting with the basic notion of a probability distribution on a finite sample space, as defined below.

Definition 10.10.8 Let $\Omega$ be a finite non-empty set. A finite probability distribution on $\Omega$ is a function $P : \Omega \to [0, 1]$ that satisfies $\sum_{\omega \in \Omega} P(\omega) = 1$. The set $\Omega$ is called the sample space of $P$.

Intuitively, the elements of $\Omega$ represent the possible outcomes of a random experiment, where the probability of outcome $\omega \in \Omega$ is $P(\omega)$.

Example 10.10.6 If we think of tossing a fair coin, then setting $\Omega = \{H, T\}$ and $P(\omega) = \frac{1}{2}$ for all $\omega \in \Omega$ gives a probability distribution that naturally describes the possible outcomes of the experiment.

The above example leads to a very important distribution in the theory of proba- 
bility, known as uniform distribution. 

Definition 10.10.9 Let $\Omega$ be a finite non-empty set. If $P(\omega) = \frac{1}{|\Omega|}$ for all $\omega \in \Omega$, then $P$ is called the uniform distribution on $\Omega$.

Now let us define another important term in the theory of probability, an event. 

Definition 10.10.10 An event is a subset $A$ of $\Omega$ and the probability of $A$ is defined to be
$$P[A] = \sum_{\omega \in A} P(\omega).$$
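The definitions of a finite probability distribution, the uniform distribution and the probability of an event can be modelled directly with exact rational arithmetic. The following is a small sketch under our own naming (`uniform`, `prob`); the fair die is an example we add for illustration alongside the coin of Example 10.10.6.

```python
from fractions import Fraction


def uniform(omega):
    """Uniform distribution on a finite non-empty sample space."""
    omega = list(omega)
    return {w: Fraction(1, len(omega)) for w in omega}


def prob(P, A):
    """P[A]: the sum of P(w) over all outcomes w in the event A."""
    return sum((P[w] for w in A), Fraction(0))


coin = uniform(["H", "T"])           # the fair coin of Example 10.10.6
die = uniform(range(1, 7))           # a fair die (our own example)

print(coin["H"], sum(coin.values()))  # each outcome has probability 1/2; total is 1
print(prob(die, {2, 4, 6}))           # probability of the event "even outcome"
```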

It is sometimes convenient to associate a real number, or other mathematical 
object, with each outcome of a random experiment. This idea has been formalized 
by the notion of a random variable as defined below. 

Definition 10.10.11 Let $P$ be a probability distribution on a sample space $\Omega$. A random variable $X$ is a function $X : \Omega \to S$, where $S$ is some set. We say that $X$ takes values in $S$. In particular, if $S = \mathbb{R}$, i.e., the values taken by $X$ are real numbers, then we say that $X$ is real valued.

For $s \in S$, “$X = s$” denotes the event $\{\omega \in \Omega : X(\omega) = s\}$. Thus $P[X = s] = \sum_{\omega \in X^{-1}(\{s\})} P(\omega)$. It is further possible to combine random variables to define new random variables. Suppose $X_1, X_2, \ldots, X_m$ are random variables, where $X_i : \Omega \to S_i$, $i = 1, 2, \ldots, m$. From these $m$ random variables, we can get a new random variable $(X_1, X_2, \ldots, X_m)$ that maps $\omega \in \Omega$ to $(X_1(\omega), X_2(\omega), \ldots, X_m(\omega)) \in S_1 \times S_2 \times \cdots \times S_m$. More generally, if $g : S_1 \times S_2 \times \cdots \times S_m \to T$ is a function, then $g((X_1, X_2, \ldots, X_m))$ denotes the random variable that maps $\omega$ to $g((X_1(\omega), X_2(\omega), \ldots, X_m(\omega)))$. Next we are going to define the distribution of a random variable $X$.

Definition 10.10.12 Let $X : \Omega \to S$ be a random variable. The variable determines a probability distribution $P_X : S \to [0, 1]$ on the set $S$, where $P_X$ is defined by $P_X(s) = P[X = s]$, for all $s \in S$. $P_X$ is called the distribution of $X$. In particular, if $P_X$ is the uniform distribution on $S$, then we say that the random variable $X$ is uniformly distributed over $S$.
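To make the notion of the distribution of a random variable concrete, the sketch below computes $P_X$ for two random variables on the sample space of two fair dice. The sample space, the function name `distribution` and both random variables are our own illustrative choices.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of faces of two fair dice, uniformly distributed.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, len(omega)) for w in omega}


def distribution(P, X):
    """Distribution P_X of the random variable X: P_X(s) = P[X = s]."""
    PX = defaultdict(Fraction)
    for w, p in P.items():
        PX[X(w)] += p
    return dict(PX)


P_sum = distribution(P, lambda w: w[0] + w[1])   # X = sum of the two faces
P_first = distribution(P, lambda w: w[0])        # Y = first face only

print(P_sum[2], P_sum[7])        # the sum is not uniformly distributed
print(set(P_first.values()))     # the first face is uniformly distributed
```

The first face is uniformly distributed over $\{1, \ldots, 6\}$ while the sum is not uniformly distributed over $\{2, \ldots, 12\}$, although both are defined on the same uniform sample space.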

Uniform distributions have many very nice and simple properties. To understand 
the nature of uniform distribution, we should know some simple criteria that will 
ensure that certain random variables have uniform distributions. For that we need 
the following definition. 

Definition 10.10.13 Let $T$ and $S$ be finite sets. We say that a function $f : S \to T$ is a regular function if every element in the image of $f$ has the same number of pre-images under $f$.

Based on the definition of regular function, let us prove the following theorem. 

Theorem 10.10.12 Let $f : S \to T$ be a surjective, regular function and $X$ be a random variable that is uniformly distributed over $S$. Then $f(X)$ is also uniformly distributed over $T$.

Proof As $f$ is surjective and regular, for every $x \in T$, the order of the set $S_x = f^{-1}(\{x\})$ will be $\frac{|S|}{|T|}$. Thus for each $x \in T$, we have
$$P[f(X) = x] = \sum_{\omega \in X^{-1}(S_x)} P(\omega) = \sum_{s \in S_x} \sum_{\omega \in X^{-1}(\{s\})} P(\omega) = \sum_{s \in S_x} P[X = s] = \sum_{s \in S_x} \frac{1}{|S|} = \frac{|S|}{|T|} \cdot \frac{1}{|S|} = \frac{1}{|T|}.$$
Hence the result. □

Based on this theorem, we prove the following theorem, which will help us in 
proving an error bound for Miller-Rabin primality testing. 

Theorem 10.10.13 Let $f : G \to G'$ be a group epimorphism from the abelian group $G$ onto the abelian group $G'$ and $X$ be a random variable that is uniformly distributed over $G$. Then $f(X)$ is also uniformly distributed over $G'$.

Proof To prove the theorem, it is sufficient to prove that $f$ is regular. For that let us consider $\ker f$. Then $\ker f$ is a subgroup of $G$ and for every $g' \in G'$, the set $f^{-1}(\{g'\})$ is a coset of $\ker f$. The theorem follows from the fact that every coset of $\ker f$ has the same size. □
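Theorem 10.10.13 can be illustrated with a power map, which is the kind of homomorphism used later in the error analysis. In the sketch below, $G = \mathbb{Z}_{13}^*$ and $f(x) = x^3$, viewed as an epimorphism onto its image; these concrete choices are ours and serve only as an example.

```python
from collections import Counter
from fractions import Fraction
from math import gcd

n, e = 13, 3
G = [x for x in range(1, n) if gcd(x, n) == 1]        # the group Z_13^*

# Size of each fibre f^{-1}({y}) for the homomorphism f(x) = x^e.
fibre_sizes = Counter(pow(x, e, n) for x in G)

# Distribution of f(X) when X is uniformly distributed over G.
image_dist = {y: Fraction(c, len(G)) for y, c in fibre_sizes.items()}

print(sorted(fibre_sizes.items()))   # every fibre (a coset of ker f) has equal size
print(set(image_dist.values()))      # so f(X) is uniform over the image of f
```

Every fibre is a coset of $\ker f = \{1, 3, 9\}$, so each of the four elements of the image is hit with probability $\frac{1}{4}$.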

Now we are in a position to prove the main result in this section. 
Theorem 10.10.14 If $n$ is an odd prime, then $MR(n) = \mathbb{Z}_n^*$. If $n$ is an odd positive composite integer, then $|MR(n)| \le \frac{n-1}{4}$.

Proof First let $n$ be an odd prime. Then by Theorem 10.4.5, $(\mathbb{Z}_n^*, \cdot)$ is a cyclic group of order $n - 1$. As $MR(n) \subseteq \mathbb{Z}_n^*$, we need to show that $\mathbb{Z}_n^* \subseteq MR(n)$. Let $a \in \mathbb{Z}_n^*$. As $n$ is a prime integer, $a^{n-1} = (1)$. Now consider any $j \in \{0, 1, \ldots, h - 1\}$ such that $a^{2^{j+1} t} = (1)$. Such a $j$ exists, as for $j = h - 1$, $a^{2^h t} = (1)$. Let $b = a^{2^j t}$. Then $b^2 = a^{2^{j+1} t} = (1)$. As $n$ is prime, by the Corollary of Theorem 10.5.1, the only possible choices for $b$ are $\pm(1)$. Thus $a^{2^{j+1} t} = (1)$ implies $a^{2^j t} = \pm(1)$, with the result that $a \in MR(n)$. Hence $MR(n) = \mathbb{Z}_n^*$.

Now we assume that $n$ is composite. We divide the proof into two cases.

Case 1: Let $n = p^e$, where $p$ is an odd prime and $e > 1$. Let us consider the map $\theta : \mathbb{Z}_n^* \to \mathbb{Z}_n^*$ defined by $\theta(x) = x^{n-1}$, $\forall x \in \mathbb{Z}_n^*$. Then clearly $\theta$ is a homomorphism and $MR(n) \subseteq \ker \theta$. Now by Lemma 10.10.9, we have $|\ker \theta| = \gcd(\phi(n), n - 1)$. Thus
$$|MR(n)| \le |\ker \theta| = \gcd(p^{e-1}(p - 1), p^e - 1) = p - 1 = \frac{p^e - 1}{p^{e-1} + \cdots + 1} \le \frac{n - 1}{4}.$$

Case 2: Let $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ be the prime factorization of $n$ with $k > 1$. Let $n - 1 = t2^h$, where $h \ge 1$ and $\gcd(t, 2) = 1$. Let $\psi : \mathbb{Z}^*_{p_1^{\alpha_1}} \times \mathbb{Z}^*_{p_2^{\alpha_2}} \times \cdots \times \mathbb{Z}^*_{p_k^{\alpha_k}} \to \mathbb{Z}_n^*$ be the isomorphism induced by the ring isomorphism of the Chinese Remainder Theorem. Let $\phi(p_i^{\alpha_i}) = t_i 2^{h_i}$, where $\phi$ is the Euler $\phi$-function, $h_i \ge 1$ and $\gcd(t_i, 2) = 1$, for $i = 1, 2, \ldots, k$. Let $l = \min\{h, h_1, h_2, \ldots, h_k\}$. Then $l \ge 1$. Also by Theorem 10.4.6, each $\mathbb{Z}^*_{p_i^{\alpha_i}}$ is a cyclic group of order $t_i 2^{h_i}$.

We claim that for $a \in MR(n)$, $a^{t2^l} = (1)$. If $l = h$, then from the definition of $MR(n)$, $a^{t2^l} = (1)$. So we assume that $l < h$. We shall prove that in this case also $a^{t2^l} = (1)$, by the method of contradiction. If possible, let $a^{t2^l} \ne (1)$ and let $j$ be the smallest index in the range $l, l + 1, \ldots, h - 1$ such that $a^{t2^{j+1}} = (1)$. Then by the definition of $MR(n)$, we have $a^{t2^j} = -(1)$. Since $l < h$, we have $l = h_i$ for some $i \in \{1, 2, \ldots, k\}$. Let $a = \psi(a_1, a_2, \ldots, a_k)$. Then we have $a_i^{t2^j} = -(1)$. Then by Lemma 10.10.10 (in multiplicative notation), the order of $a_i^t$ in $\mathbb{Z}^*_{p_i^{\alpha_i}}$ is $2^{j+1}$. Now by Lagrange's Theorem, $2^{j+1}$ must divide $t_i 2^{h_i}$, a contradiction as $j + 1 > j \ge l = h_i$ and $\gcd(2, t_i) = 1$. Thus $a^{t2^l} = (1)$.

From the claim in the previous paragraph and the definition of $MR(n)$, it follows that for any $a \in MR(n)$, $a^{t2^{l-1}} = \pm(1)$. Now we are going to use some results from probability theory. We consider an experiment in which $a$ is chosen at random (i.e., with a uniform distribution) from $\mathbb{Z}_n^*$. To prove the main result, it is sufficient to prove that $\Pr[a^{t2^{l-1}} = \pm(1)] \le \frac{1}{4}$. Let $a = \psi(a_1, a_2, \ldots, a_k)$. As $a$ is uniformly distributed over $\mathbb{Z}_n^*$, each $a_i \in \mathbb{Z}^*_{p_i^{\alpha_i}}$ is also uniformly distributed over $\mathbb{Z}^*_{p_i^{\alpha_i}}$ and the collection of all the $a_i$'s is a mutually independent collection of random variables. Let us consider the group homomorphism $f_j^i : \mathbb{Z}^*_{p_i^{\alpha_i}} \to \mathbb{Z}^*_{p_i^{\alpha_i}}$ defined by $f_j^i(a) = a^{t2^j}$, for $i = 1, 2, \ldots, k$ and $j = 0, 1, 2, \ldots, h$. Then, arguing as in Lemma 10.10.9, we have $|\ker f_j^i| = \gcd(t_i 2^{h_i}, t2^j)$. Also, by using the first isomorphism theorem, we have
$$|\mathrm{Im}(f_j^i)| = \frac{t_i 2^{h_i}}{\gcd(t_i 2^{h_i}, t2^j)}.$$
Further, as $l \le h$ and $l \le h_i$, $\gcd(t_i 2^{h_i}, t2^l)$ divides $\gcd(t_i 2^{h_i}, t2^h)$ and $\gcd(t_i 2^{h_i}, t2^l) = 2 \gcd(t_i 2^{h_i}, t2^{l-1})$; it follows that $|\mathrm{Im}(f_h^i)|$ divides $|\mathrm{Im}(f_l^i)|$ and $|\mathrm{Im}(f_{l-1}^i)| = 2\,|\mathrm{Im}(f_l^i)|$. Thus $|\mathrm{Im}(f_{l-1}^i)|$ is even and is at least $2\,|\mathrm{Im}(f_h^i)|$. As $\mathbb{Z}^*_{p_i^{\alpha_i}}$ is cyclic and $\mathrm{Im}(f_{l-1}^i)$ is a subgroup of $\mathbb{Z}^*_{p_i^{\alpha_i}}$, $\mathrm{Im}(f_{l-1}^i)$ is also cyclic. So, in particular, $\mathrm{Im}(f_{l-1}^i)$ contains a unique subgroup of order 2, namely $\{(1), -(1)\}$. Thus $-(1) \in \mathrm{Im}(f_{l-1}^i)$.

Now let us consider two events $E_1$ and $E_2$ as follows. Note that the event $a^{t2^{l-1}} = \pm(1)$ occurs if and only if either

1. $E_1$: $a_i^{t2^{l-1}} = +(1)$ for $i = 1, 2, \ldots, k$, or

2. $E_2$: $a_i^{t2^{l-1}} = -(1)$ for $i = 1, 2, \ldots, k$.

Further note that the events $E_1$ and $E_2$ are disjoint, and since the values $a_i^{t2^{l-1}}$ are mutually independent, with each value $a_i^{t2^{l-1}}$ uniformly distributed over $\mathrm{Im}(f_{l-1}^i)$ by Theorem 10.10.13, and since $\mathrm{Im}(f_{l-1}^i)$ contains $\pm(1)$, we have
$$\Pr[a^{t2^{l-1}} = \pm(1)] = \Pr[E_1] + \Pr[E_2] = 2 \prod_{i=1}^{k} \frac{1}{|\mathrm{Im}(f_{l-1}^i)|},$$
and since $|\mathrm{Im}(f_{l-1}^i)| \ge 2\,|\mathrm{Im}(f_h^i)|$, we have
$$\Pr[a^{t2^{l-1}} = \pm(1)] \le 2^{-k+1} \prod_{i=1}^{k} \frac{1}{|\mathrm{Im}(f_h^i)|}. \qquad (10.16)$$
If $k \ge 3$, then from (10.16) we can directly say that $\Pr[a^{t2^{l-1}} = \pm(1)] \le \frac{1}{4}$ and in that case, we are done. Now let us suppose that $k = 2$. In this case, Theorem 10.10.7 implies that $n$ is not a Carmichael number, which implies that for some $i \in \{1, 2\}$ we must have $\mathrm{Im}(f_h^i) \ne \{(1)\}$, with the result $|\mathrm{Im}(f_h^i)| \ge 2$, and (10.16) finally implies that $\Pr[a^{t2^{l-1}} = \pm(1)] \le \frac{1}{4}$, as required. □
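Both assertions of Theorem 10.10.14 can be checked by brute force for small $n$. The sketch below enumerates $MR(n)$ straight from its definition; the function names `mr_set` and `is_prime_trial` are hypothetical, and the search range is an arbitrary choice of ours.

```python
def mr_set(n):
    """Enumerate MR(n) for an odd integer n >= 3, directly from the definition."""
    h, t = 0, n - 1
    while t % 2 == 0:
        h, t = h + 1, t // 2
    members = []
    for a in range(1, n):
        if pow(a, n - 1, n) != 1:
            continue
        # For j = 0, ..., h-1: a^(t*2^(j+1)) = 1 must imply a^(t*2^j) = +-1.
        ok = all(
            pow(a, t * 2**j, n) in (1, n - 1)
            for j in range(h)
            if pow(a, t * 2**(j + 1), n) == 1
        )
        if ok:
            members.append(a)
    return members


def is_prime_trial(n):
    """Trial-division primality check, used only as an independent reference."""
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


print(len(mr_set(13)))             # for the prime 13, MR(13) is all of Z_13^*
print(mr_set(9), (9 - 1) // 4)     # for n = 9 the bound (n-1)/4 is attained
worst = max(
    (4 * len(mr_set(n)) / (n - 1), n)
    for n in range(9, 400, 2) if not is_prime_trial(n)
)
print(worst)                       # largest ratio 4|MR(n)|/(n-1) found, never above 1
```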


10.11 Exercise 

Exercises-I 

1. Show that the last digit in the decimal expression of $F_n = 2^{2^n} + 1$ is 7 if $n \ge 2$.

2. Use the fact that every prime divisor of $F_4 = 65537$ is of the form $2^6 k + 1$ to verify that $F_4$ is a prime integer.

3. Estimate the number of decimal digits in the $m$th Fermat number $F_m$.

4. Show that 91 is a pseudo-prime to the base 3.

5. Show that every odd composite integer is a pseudo-prime to the base 1 and $-1$.

6. Show that if $p$ is a prime and $2^p - 1$ is composite, then $2^p - 1$ is a pseudo-prime to the base 2.

7. Show that if $n$ is a pseudo-prime to the bases $a$ and $b$, then $n$ is also a pseudo-prime to the base $ab$.

8. Show that if $a$ and $n$ are positive integers with $\gcd(a, n) = \gcd(a - 1, n) = 1$, then $1 + a + a^2 + \cdots + a^{\phi(n)-1} \equiv 0 \pmod{n}$.

9. Show that $a^{\phi(b)} + b^{\phi(a)} \equiv 1 \pmod{ab}$, if $a$ and $b$ are relatively prime positive integers.

10. Use Euler's Theorem to find the last digit in the decimal representation of $7^{1000}$.

11. Find the last digit in the decimal representation of $17^{17^{17}}$.

12. Prove the following:

(a) If $n$ is an integer greater than 2, then $\phi(n)$ is even.

(b) $\phi(3n) = 3\phi(n)$ if and only if 3 divides $n$.

(c) $\phi(3n) = 2\phi(n)$ if and only if 3 does not divide $n$.

(d) If $n$ is odd, then $\phi(2n) = \phi(n)$.

(e) If $n$ is even, then $\phi(2n) = 2\phi(n)$.

(f) If $m$ divides $n$, then $\phi(m)$ divides $\phi(n)$.

(g) If $n$ is a composite positive integer and $\phi(n)$ divides $n - 1$, then $n$ is a square free integer and is the product of at least three distinct primes.

13. For every positive integer $n$ and every positive integer $a \ge 2$, prove that $n$ divides $\phi(a^n - 1)$.

14. Which of the following congruences have solutions, and how many: $x^2 \equiv 2 \pmod{11}$ and $x^2 \equiv -2 \pmod{59}$?

15. Prove that $\sigma(n)$ is an odd integer if and only if $n$ is a perfect square or the double of a perfect square.

16. Prove that there exist infinitely many primes of the form $8k + 1$.

17. Prove that there exist infinitely many primes of the form $8k - 1$.

18. Prove that if $p$ and $q$ are primes of the form $4k + 3$ and if $x^2 \equiv p \pmod{q}$ has no solutions, then $x^2 \equiv q \pmod{p}$ has two solutions.

19. Find all solutions of $x^2 \equiv 1 \pmod{15}$.

20. Find all primes $p$ such that $x^2 \equiv -1 \pmod{p}$ has a solution.

21. Write down the last two digits of $9^{1500}$.

Exercises-II Identify the correct alternative(s) (there may be more than one) from 
the following list: 

1. Which of the following numbers is the largest?

$2^{3^4}$, $2^{4^3}$, $3^{2^4}$, $3^{4^2}$, $4^{2^3}$, $4^{3^2}$.

(a) $4^{3^2}$ (b) $3^{4^2}$ (c) $2^{3^4}$ (d) $4^{2^3}$.

2. Suppose the sum of seven positive numbers is 21. What is the minimum possible value of the average of the squares of these numbers?

(a) 9 (b) 21 (c) 63 (d) 7.

3. The number of elements in the set $\{n : 1 \le n \le 1000$, $n$ and 1000 are relatively prime$\}$ is

(a) 300 (b) 250 (c) 100 (d) 400.

4. The number of 4 digit numbers with no two digits common is

(a) 5040 (b) 3024 (c) 4536 (d) 4823.

5. The unit digit of $2^{100}$ is

(a) 2 (b) 8 (c) 6 (d) 4.

6. The number of multiples of $10^{44}$ that divide $10^{55}$ is

(a) 121 (b) 12 (c) 11 (d) 144.

7. The number of positive divisors of 50,000 is

(a) 40 (b) 30 (c) 20 (d) 50.

8. The last digit of $(38)^{2011}$ is

(a) 4 (b) 6 (c) 2 (d) 8.

9. The last two digits of $7^{81}$ are

(a) 37 (b) 17 (c) 07 (d) 47.

10. Given a positive integer $n$, let $\phi(n)$ denote the number of integers $k$ such that $1 \le k \le n$ and $\gcd(k, n) = 1$. Then identify the correct statement(s):

(a) $\phi(m)$ divides $m$ for every positive integer $m$;

(b) $a$ divides $\phi(a^m - 1)$ for all positive integers $a$ and $m$ such that $\gcd(a, m) = 1$;

(c) $m$ divides $\phi(a^m - 1)$ for all positive integers $a$ and $m$ such that $\gcd(a, m) = 1$;

(d) $m$ divides $\phi(a^m - 1)$ for all positive integers $a$ and $m$.


10.12 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003; Burton 1989; Hoff- 
stein et al. 2008; Ireland and Rosen 1990; Jones and Jones 1998; Katz and Lindell 
2007; Koblitz 1994; Rosen 1992; Mollin 2009; Ribenboim 2004; Stinson 2002) for 
further details. 
Chapter 11

Algebraic Numbers 


Algebraic number theory arose through the attempts of mathematicians to prove Fer- 
mat’s Last Theorem. One of the continuous themes since early 20th century which 
motivated algebraic number theory was to establish analogy between algebraic num- 
ber fields and algebraic function fields. The study of algebraic number theory was 
initiated by many mathematicians. This list includes the names of Kronecker, Kum- 
mer, Dedekind, Dirichlet, Gauss, and many others. Gauss called algebraic number 
theory ‘Queen of Mathematics’. Andrew Wiles established Fermat’s Last Theorem 
a few years back. An algebraic number is a complex number which is algebraic 
over the field Q of rational numbers. An algebraic number field is a subfield of the 
field C of complex numbers, which is a finite field extension of the field Q and is 
obtained from Q by adjoining a finite number of algebraic elements. The concepts 
of algebraic numbers, algebraic integers, Gaussian integers, algebraic number fields 
and quadratic fields are introduced in this chapter after a short discussion on general 
properties of field extensions and finite fields. Moreover, the countability of alge- 
braic numbers, existence of transcendental numbers, impossibility of duplication of 
general cube and impossibility of trisection of a general angle by straight edge and 
compass are shown. The celebrated theorem known as Fundamental Theorem of 
Algebra is also proved in this chapter by using the tools from homotopy theory as 
discussed in Chap. 2. This theorem proves the algebraic completeness of the
field of complex numbers. Like the field of complex numbers, we further prove that
the field of algebraic numbers is also algebraically closed.


11.1 Field Extension 

We begin with a review of basic concepts of field extension followed by finite fields. 
While studying field extension, we mainly consider a pair of fields F and K such 
that F is a subfield of K. Taking F as the basic field, an extension field K of F is 
a field K which contains F as a subfield. The basic results needed for the study of 
field extensions are discussed first followed by a discussion on simple extensions. 

M.R. Adhikari, A. Adhikari, Basic Modern Algebra with Applications , 487 
Definition 11.1.1 Let $K$ be a field and $F$ be a subfield of $K$. Then $K$ is said to be a field extension of $F$, written as $K/F$.

Example 11.1.1 (i) C is a field extension of R, denoted by C/R. 

(ii) $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}$ is a field extension of $\mathbb{Q}$, denoted by $\mathbb{Q}(\sqrt{2})/\mathbb{Q}$.

(iii) $\mathbb{R}$ is a field extension of $\mathbb{Q}$, denoted by $\mathbb{R}/\mathbb{Q}$.

(iv) Let $K$ be a field. If char $K$ is 0, then $K$ contains a subfield $F$ isomorphic to $\mathbb{Q}$. If char $K = p > 0$, for some prime $p$, then $K$ contains a subfield $F$ isomorphic to $\mathbb{Z}_p$. Thus $K$ can be viewed as a field extension of $F$, where $F$ is a field isomorphic to $\mathbb{Q}$ or $\mathbb{Z}_p$ according as char $K$ is 0 or $p$.

(v) Let $F$ be a field and $K = F(x)$, the field of rational functions in $x$ over $F$, which is the quotient field of the integral domain $F[x]$. Then $K$ is a field extension of $F$.

Definition 11.1.2 Let $K$ be a field extension of $F$ and $\alpha_1, \alpha_2, \ldots, \alpha_n$ be in $K$. Then the smallest field extension of $F$ containing both $F$ and $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ is called the field generated by $F$ and $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ and is denoted by $F(\alpha_1, \alpha_2, \ldots, \alpha_n)$.

We now describe the elements of $F(\alpha_1, \alpha_2, \ldots, \alpha_n)$.

Theorem 11.1.1 Let $K$ be a field extension of $F$ and $\alpha_1, \alpha_2, \ldots, \alpha_n$ be in $K$. Then
$$F(\alpha_1, \alpha_2, \ldots, \alpha_n) = \{f(\alpha_1, \alpha_2, \ldots, \alpha_n)/g(\alpha_1, \alpha_2, \ldots, \alpha_n) : f, g \in F[x_1, x_2, \ldots, x_n] \text{ and } g(\alpha_1, \alpha_2, \ldots, \alpha_n) \ne 0\}.$$

Proof Let $S$ be the set defined by $S = \{f(\alpha_1, \alpha_2, \ldots, \alpha_n)/g(\alpha_1, \alpha_2, \ldots, \alpha_n) : f, g \in F[x_1, x_2, \ldots, x_n]$ and $g(\alpha_1, \alpha_2, \ldots, \alpha_n) \ne 0\}$. Then $S$ is a subfield of $K$ such that $F \subseteq S$, since $a = a/1 \in S$ for every $a \in F$. Let $L$ be a subfield of $K$ containing $F$ and $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$. Then $L \supseteq S$, because $L$ contains $f(\alpha_1, \alpha_2, \ldots, \alpha_n)$ and also $f(\alpha_1, \alpha_2, \ldots, \alpha_n)/g(\alpha_1, \alpha_2, \ldots, \alpha_n)$, if $g(\alpha_1, \alpha_2, \ldots, \alpha_n) \ne 0$. This shows that $S$ is the smallest subfield of $K$ containing $F$ and $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$. Hence $F(\alpha_1, \alpha_2, \ldots, \alpha_n) = S$. □

Corollary Let $K$ be a field extension of $F$ and $\alpha \in K$. Then $F(\alpha) = \{f(\alpha)/g(\alpha)$, where $f(x), g(x) \in F[x]$ and $g(\alpha) \ne 0\}$ is the quotient field of $F[\alpha]$.

Definition 11.1.3 Let $K$ be a field extension of $F$. The field $K$ is said to be a simple extension of $F$ iff there exists an element $\alpha$ in $K$ such that $K = F(\alpha)$. The element $\alpha \in K$ is said to be a primitive element of the extension and $F(\alpha)$ is said to be generated by $F$ and $\alpha$.
Example 11.1.2 The field extension $\mathbb{Q}(\sqrt{5})$ of $\mathbb{Q}$ is a simple extension with $\sqrt{5}$ as its primitive element. The field $\mathbb{Q}(\sqrt{5})$ is generated by $\mathbb{Q}$ and a root $\sqrt{5}$ of the equation $x^2 - 5 = 0$ and consists of all real numbers $a + b\sqrt{5}$ with rational coefficients $a$ and $b$.

Definition 11.1.4 A field F is said to be a prime field iff F has no proper subfield. 

Example 11.1.3 (i) Q is a prime field, because Q has no proper subfield. 

(ii) Z p is a prime field for every prime integer p. 

Definition 11.1.5 Let $K$ be a field extension of the field $F$. Then an element $\alpha$ of $K$ is said to be

(a) a root of a polynomial $f(x) = a_0 + a_1 x + \cdots + a_n x^n$ in $F[x]$ iff $f(\alpha) = a_0 + a_1 \alpha + \cdots + a_n \alpha^n = 0$;

(b) algebraic over $F$ iff $\alpha$ is a root of some non-null polynomial $f(x)$ in $F[x]$;

(c) transcendental over $F$ iff $\alpha$ is not a root of any non-null polynomial in $F[x]$.

Remark Every element a of a field F is algebraic over F . 

Definition 11.1.6 Let $K$ be a field extension of the field $F$. Then $K$ is said to be an algebraic extension over $F$ iff every element of $K$ is algebraic over $F$. Otherwise, $K$ is called a transcendental extension over $F$. A simple extension $F(\alpha)$ is said to be algebraic or transcendental over $F$ according to whether the element $\alpha$ is algebraic or transcendental over $F$.

Example 11.1.4 

(a) (i) For the field extension $\mathbb{Q}(\sqrt{2})$ of $\mathbb{Q}$, every element $\alpha = a + b\sqrt{2} \in \mathbb{Q}(\sqrt{2})$ is algebraic over $\mathbb{Q}$.

(ii) The element $i \in \mathbb{C}$ is algebraic over $\mathbb{R}$.

(iii) The element $\pi$ in $\mathbb{R}$ is transcendental over $\mathbb{Q}(\sqrt{2})$.

(iv) The element $2\pi i \in \mathbb{C}$ is algebraic over $\mathbb{R}$ but transcendental over $\mathbb{Q}$.

(b) For the field extension $\mathbb{R}$ of $\mathbb{Q}$,

(i) $e$ and $\pi$ are both transcendental over $\mathbb{Q}$; for a proof see [Hardy and Wright (2008, pp. 218-227)];

(ii) $\sqrt{\pi}$ is transcendental over $\mathbb{Q}$;

(iii) $\pi^2$ is transcendental over $\mathbb{Q}$.

Remark Example 11.1.4(a)(iv) shows that the two properties algebraic and tran- 
scendental vary depending on the base field. 

Theorem 11.1.2 Let F be a field and F(x) be the field of rational functions in x 
over F . Then the element x of F(x) is transcendental over F . 
Proof $F(x)$ is clearly a field extension of the field $F$. If $x$ is not transcendental over $F$, then there is a non-null polynomial $f(y) = a_0 + a_1 y + \cdots + a_n y^n$ over $F$, of which $x$ is a root. This shows that $a_0 + a_1 x + \cdots + a_n x^n = 0$. Hence each $a_i = 0$, which implies that $f(y)$ is a null polynomial. This gives a contradiction. □

Theorem 11.1.3 Let $K$ be a field extension of $F$ and an element $\alpha$ of $K$ be algebraic over $F$. Then $F(\alpha)$ and $F[x]/\langle f(x) \rangle$ are isomorphic for some monic irreducible polynomial $f(x)$ in $F[x]$, of which $\alpha$ is a root.

Proof Define the mapping $\mu : F[x] \to F[\alpha]$, $f(x) \mapsto f(\alpha)$.

Then $\mu$ is an epimorphism and hence by the First Isomorphism Theorem, $F[x]/\ker \mu \cong F[\alpha]$. But $\alpha$ is algebraic over $F$ $\Rightarrow$ $\ker \mu \ne \{0\}$. Again $\ker \mu$ is an ideal of the Euclidean domain $F[x]$. But $F[x]$ is a PID $\Rightarrow$ $\ker \mu$ is a principal ideal of $F[x]$. Hence $\ker \mu = \langle g(x) \rangle$ for some $g(x) \in F[x]$. If $a$ is the leading coefficient of $g(x)$, then $f(x) = a^{-1} g(x)$ is a monic polynomial of $F[x]$ and $\ker \mu = \langle g(x) \rangle = \langle f(x) \rangle$. Since $F[\alpha]$ is a subring of the field $K$, it is an integral domain, so $\ker \mu$ is a non-zero prime ideal; hence $f(x)$ is irreducible in $F[x]$ and $\langle f(x) \rangle$ is a maximal ideal in $F[x]$. Consequently, $F[x]/\langle f(x) \rangle$ is a field. Thus $F(\alpha) = Q(F[\alpha]) \cong Q(F[x]/\langle f(x) \rangle) = F[x]/\langle f(x) \rangle$, where $Q(\cdot)$ denotes the quotient field. Hence the theorem follows. □

Corollary Let $K$ be a field extension of the field $F$. If an element $\alpha$ of $K$ is algebraic over $F$, then there exists a unique monic irreducible polynomial in $F[x]$ of which $\alpha$ is a root.

Proof Using Theorem 11.1.3, there exists a monic irreducible polynomial $f(x)$ in $F[x]$ such that $\alpha$ is a root of $f(x)$. If $g(x)$ is a monic irreducible polynomial in $F[x]$ such that $\alpha$ is a root of $g(x)$, then by the above definition of $\mu$, $g(x) \in \ker \mu = \langle f(x) \rangle$. Consequently, $f(x)$ divides $g(x)$ in $F[x]$. Suppose $g(x) = q(x) f(x)$ for some $q(x) \in F[x]$. Since $g(x)$ is irreducible in $F[x]$, it follows that either $q(x)$ or $f(x)$ is a unit in $F[x]$. As $f(x)$ is not a unit in $F[x]$, $q(x)$ is a unit and thus $q(x) = c$ for some non-zero element $c$ of $F$. Again, since $f(x)$ and $g(x)$ are both monic, $c = 1$ and hence $f(x) = g(x)$. □

This corollary leads to Definition 11.1.7. 

Definition 11.1.7 Let $F$ be a field, $K$ be a field extension of $F$ and $\alpha \in K$ be algebraic over $F$. Then the unique monic irreducible polynomial $f(x)$ in $F[x]$ having $\alpha$ as a root is called the minimal polynomial of $\alpha$ over $F$ and the degree of $f(x)$ is called the degree of $\alpha$ over $F$.

We now show that, under a suitable condition, $F(\alpha) = F[\alpha]$.

Proposition 11.1.1 Let $K$ be a field extension of the field $F$ and $\alpha \in K$. Then $F(\alpha) = F[\alpha]$ iff $\alpha$ is algebraic over $F$.

Proof Suppose $\alpha$ is algebraic over $F$. Then as proved in Theorem 11.1.3, $F[\alpha] \cong F[x]/\langle f(x) \rangle$, where $f(x)$ is a monic irreducible polynomial in $F[x]$, of which $\alpha$ is a root. Since $F[x]/\langle f(x) \rangle$ is a field, $F[\alpha]$ is a field and hence $F(\alpha) = F[\alpha]$.

Conversely, suppose $F(\alpha) = F[\alpha]$. If $\alpha = 0$, then $\alpha$ is the root of the polynomial $x \in F[x]$. If $\alpha \ne 0$, then $\alpha^{-1} \in F(\alpha) = F[\alpha]$ and hence $\alpha^{-1} = a_0 + a_1 \alpha + \cdots + a_n \alpha^n$ for some $a_i \in F$, $i = 0, 1, \ldots, n$, $n \in \mathbb{N}$. Consequently, $0 = -1 + a_0 \alpha + a_1 \alpha^2 + \cdots + a_n \alpha^{n+1}$ $\Rightarrow$ $\alpha$ is a root of the non-null polynomial $f(x) = -1 + a_0 x + a_1 x^2 + \cdots + a_n x^{n+1} \in F[x]$ $\Rightarrow$ $\alpha$ is algebraic over $F$. □
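Proposition 11.1.1 can be seen at work in $\mathbb{Q}(\sqrt{5})$: the inverse of a non-zero element $a + b\sqrt{5}$ is again of the form $c + d\sqrt{5}$ with rational $c, d$, so no genuine fractions in $\sqrt{5}$ are needed. In the sketch below an element $a + b\sqrt{5}$ is represented by the pair $(a, b)$ of rationals; this representation and the names `mul` and `inv` are our own illustrative assumptions.

```python
from fractions import Fraction


def mul(u, v):
    """Product (a + b*sqrt5)(c + d*sqrt5), using sqrt5^2 = 5."""
    (a, b), (c, d) = u, v
    return (a * c + 5 * b * d, a * d + b * c)


def inv(u):
    """Inverse of a non-zero a + b*sqrt5, obtained by multiplying with a - b*sqrt5."""
    a, b = u
    norm = a * a - 5 * b * b     # non-zero for (a, b) != (0, 0), as sqrt5 is irrational
    return (a / norm, -b / norm)


u = (Fraction(1), Fraction(2))   # the element 1 + 2*sqrt5
print(inv(u))                    # again a pair of rationals, i.e. an element of Q[sqrt5]
print(mul(u, inv(u)))            # the identity element (1, 0)
```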

Example 11.1.5 (i) The element $i$ of $\mathbb{C}$ is algebraic over $\mathbb{R}$ and $x^2 + 1$ is the minimal polynomial of $i$ over $\mathbb{R}$. Again $x - i$ is the minimal polynomial of $i$ over $\mathbb{C}$.

(ii) The element $\sqrt{5}$ of $\mathbb{Q}(\sqrt{5})$ is algebraic over $\mathbb{Q}$ and $x^2 - 5$ is the minimal polynomial of $\sqrt{5}$ over $\mathbb{Q}$. The degree of $\sqrt{5}$ over $\mathbb{Q}$ is 2.

Theorem 11.1.4 Let $K$ be a field extension of the field $F$ and let the element $\alpha$ of $K$ be algebraic over $F$. Suppose $f(x)$ is the minimal polynomial of $\alpha$ over $F$.

(a) If a polynomial $g(x)$ of $F[x]$ has $\alpha$ as a root, then $f(x)$ divides $g(x)$ in $F[x]$.

(b) $f(x)$ is the monic polynomial of smallest degree in $F[x]$ such that $\alpha$ is a root of $f(x)$.

Proof (a) follows from the proof of the Corollary to Theorem 11.1.3, since $g(x) \in \ker \mu = \langle f(x) \rangle$.

(b) If $f(x)$ is not the polynomial of smallest degree in $F[x]$ of which $\alpha$ is a root, then there exists a monic polynomial $g(x)$ in $F[x]$ such that $\alpha$ is a root of $g(x)$ and $\deg g(x) < \deg f(x)$. But by (a), $f(x)$ divides the non-zero polynomial $g(x)$, so $\deg f(x) \le \deg g(x)$. This gives a contradiction. □

Definition 11.1.8 Let L and K be two field extensions of the same field F . An 
isomorphism x// : L — >► K of fields, whose restriction to F is the identity homomor- 
phism, is called an isomorphism of field extensions or an F -isomorphism. Two field 
extensions L and K of the same field F are said to be isomorphic field extensions 
iff there exists an F -isomorphism x/r : L — >► K. 

Proposition 11.1.2 Let L and K be two field extensions of the same field F and 
x/f : L — >► K be an F -isomorphism. Suppose /(x) G F[x] and a is a root of f(x) 
in L. If f = x/s(a) G K , then f is also a root of /(x). 

Proof If f{x) = ao + a\x + • • • + a n x n G T^x], then xfiat) = at and x/s(a) = f>. 

Again f(a) = 0 gives 0 = xfiO) = xfifia)) = x/s (ao + a\a -\ \-a n a n ) = x//(ao) + 

xf (a\)x// (a) + • • • + xls(a n )(x//(oi)) n , since x/s is a homomorphism. This shows that 
0 = ao + a\f> H b a n f > n . Hence f = x/s (a) is a root of /(x). □ 

Theorem 11.1.5 Let F be a field and L, K be two field extensions of F. If $\alpha \in L$
and $\beta \in K$ are both algebraic elements over F, then there is an isomorphism
$\psi : F(\alpha) \to F(\beta)$ of fields, such that $\psi(\alpha) = \beta$ and $\psi|_F$ = identity, iff $\alpha$ and $\beta$
have the same minimal polynomial over F.

Proof Suppose that $f(x)$ is the minimal polynomial of both $\alpha$ and $\beta$ over F. Then
by Theorem 11.1.3, there exist two isomorphisms $\mu : F[x]/\langle f(x)\rangle \to F(\alpha)$ and
$\lambda : F[x]/\langle f(x)\rangle \to F(\beta)$. Hence the composite isomorphism
$\psi = \lambda \circ \mu^{-1} : F(\alpha) \to F(\beta)$ shows that $F(\alpha)$ and $F(\beta)$ are isomorphic under $\psi$.

Conversely, suppose $\psi : F(\alpha) \to F(\beta)$ is an isomorphism such that $\psi(\alpha) = \beta$
and $\psi|_F$ = identity. If $f(x) \in F[x]$ is a polynomial such that $f(\alpha) = 0$, then
$\psi(\alpha) = \beta$ is also a root of $f(x)$ by Proposition 11.1.2. In particular, $\beta$ is a root
of the minimal polynomial of $\alpha$, which is monic and irreducible over F. Hence $\alpha$ and $\beta$
have the same minimal polynomial. □

Example 11.1.6 If $\omega = e^{2\pi i/3}$ is a complex cube root of 1 and $\alpha$ is a real root of
the polynomial $f(x) = x^3 - 3$ in Q[x], then $f(x)$ is the minimal polynomial of
both $\alpha$ and $\omega\alpha$ over Q. Hence by Theorem 11.1.5 there exists a Q-isomorphism
$\psi : Q(\alpha) \to Q(\omega\alpha)$, such that $\psi(\alpha) = \omega\alpha$. Note that the elements of $Q(\alpha)$ are real
numbers, but the elements of $Q(\omega\alpha)$ are not all real.

Let F be a field and K be a field extension of F. Then we may consider K as a 
vector space over F by using the field operations. 

Definition 11.1.9 Let F be a field and K a field extension of F. Then the dimen- 
sion of the vector space K over F is called the degree or dimension of K over F and 
is denoted by [K : F]. If [K : F] is finite, then K is called a finite extension over 
F (or K is said to be finite over F) or K/F is called a finite extension. Otherwise, 
K is said to be an infinite extension of F. 

Example 11.1.7 (i) C is a finite extension over R. This is so because {1, i} forms a
basis of C over R and hence [C : R] = 2.

(ii) Let $K = Q(\{\sqrt{q} : q \text{ is a prime integer}\}) \subset R$. Then [K : Q] is not finite.
[Hint. As q is a prime integer, $\sqrt{q}$ is not an element of Q. Let us assume
that $p_1, p_2, \ldots, p_n$ are distinct prime integers (each different from q) such that
$\sqrt{q} \notin Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$, which is our induction hypothesis. We claim that
$\sqrt{q} \notin Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n}, \sqrt{p_{n+1}})$, where $p_1, p_2, \ldots, p_n, p_{n+1}$ are distinct
prime integers such that each is different from q. As $\sqrt{q} \notin Q$, the induction hypothesis
is true for n = 0. If $\sqrt{q} \in Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n}, \sqrt{p_{n+1}})$, then there exist
elements $x, y \in Q(\sqrt{p_1}, \ldots, \sqrt{p_n})$ such that $\sqrt{q} = x + y\sqrt{p_{n+1}}$. As y = 0
contradicts the induction hypothesis, it follows that $y \neq 0$. If x = 0, then $q = y^2 p_{n+1}$
gives also a contradiction, since q and $p_{n+1}$ are distinct primes. Again for $x \neq 0$
and $y \neq 0$, $q = x^2 + 2xy\sqrt{p_{n+1}} + p_{n+1}y^2 \Rightarrow \sqrt{p_{n+1}} = (q - x^2 - p_{n+1}y^2)/2xy \in
Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n}) \Rightarrow \sqrt{q} \in Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$. This contradicts
the hypothesis again. Thus by induction, we find that for
any positive integer r, if $p_1, p_2, \ldots, p_r, q$ are distinct prime integers, then
$\sqrt{q} \notin Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_r})$. Consequently, we get a strictly ascending infinite chain of
fields $Q \subsetneq Q(\sqrt{2}) \subsetneq Q(\sqrt{2}, \sqrt{3}) \subsetneq \cdots$. This shows that [K : Q] is not
finite. Clearly, [R : Q] is not finite.]

Theorem 11.1.6 Let K be a finite field extension of the field F. If [K : F] = n, then

(a) there exists a basis of K over F, consisting of n elements of K and

(b) any set of (n + 1) (or more) elements of K is always linearly dependent over F.

Proof As K is an n-dimensional vector space over F, the theorem follows from the
properties of finite dimensional vector spaces. □

Corollary 1 The degree of an algebraic element $\alpha$ over a field F is equal to the
dimension of the extension field $F(\alpha)$, regarded as a vector space over F, with a
basis $B = \{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ if $[F(\alpha) : F] = n$.

Corollary 2 Let $\alpha$ and $\beta$ be two algebraic elements over the same field F such that
$F(\alpha) = F(\beta)$. Then $\alpha$ and $\beta$ have the same degree over F.

Theorem 11.1.7 Let K be a finite field extension of the field F and an element $\alpha$
of K be algebraic over F. If n is the degree of the minimal polynomial $f(x)$ of $\alpha$
over F, then $[F(\alpha) : F] = n$.

Proof It is sufficient to prove that the set $S = \{1, \alpha, \ldots, \alpha^{n-1}\}$ forms a basis of
the vector space $F(\alpha)$ over F. Suppose $a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} = 0$, where
each $a_i \in F$. Then $\alpha$ is a root of the polynomial $g(x) = a_0 + a_1 x + \cdots +
a_{n-1}x^{n-1} \in F[x]$. Hence $f(x)$ divides $g(x)$ in F[x] by Theorem 11.1.4(a). But this
is possible only when $a_0 = a_1 = \cdots = a_{n-1} = 0$, since $\deg g(x) < \deg f(x) = n$.
Hence S is linearly independent over F. We now claim that the set S
generates the vector space $F(\alpha)$. Let $h(\alpha) \in F(\alpha)$. As $\alpha$ is algebraic over F,
$F(\alpha) = F[\alpha]$ (see Proposition 11.1.1). Hence $h(\alpha) \in F[\alpha]$. Let $h(x)$ be the
polynomial in F[x] corresponding to $h(\alpha)$ in $F[\alpha]$. Again F is a field $\Rightarrow$ F[x] is a
Euclidean domain $\Rightarrow$ there exist polynomials $q(x)$ and $r(x)$ in F[x] such that
$h(x) = f(x)q(x) + r(x)$, where $r(x) = 0$ or $\deg r(x) < \deg f(x)$. Since $f(\alpha) = 0$,
$h(\alpha) = f(\alpha)q(\alpha) + r(\alpha) = r(\alpha)$, so either $h(\alpha) = 0$ or $h(\alpha) = r(\alpha)$ is a linear
combination of elements of S, since $\deg r(x) < n$. Thus S forms a basis of $F(\alpha)$ over F and hence
$[F(\alpha) : F] = n$. □
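
The basis $\{1, \alpha, \ldots, \alpha^{n-1}\}$ makes arithmetic in $F(\alpha)$ entirely mechanical: multiply coordinate vectors as polynomials and then reduce with the relation $f(\alpha) = 0$. The following is a minimal sketch of this idea for F = Q; the function name `multiply_mod` and the sample field $Q(\sqrt[3]{5})$ are our own illustrative choices, not notation from the text.

```python
from fractions import Fraction

def multiply_mod(u, v, f):
    """Multiply u and v in Q(alpha), both given by coordinates with respect to
    the basis {1, alpha, ..., alpha^(n-1)}.

    f = [a_0, ..., a_(n-1)] holds the lower coefficients of the monic minimal
    polynomial x^n + a_(n-1) x^(n-1) + ... + a_0 of alpha.
    """
    n = len(f)
    prod = [Fraction(0)] * (2 * n - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += Fraction(a) * Fraction(b)
    # Eliminate alpha^k for k >= n using alpha^n = -(a_0 + ... + a_(n-1) alpha^(n-1)).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for i in range(n):
            prod[k - n + i] -= c * f[i]
    return prod[:n]

f = [-5, 0, 0]     # minimal polynomial x^3 - 5 of alpha = cube root of 5
u = [1, 1, 0]      # 1 + alpha
v = [0, 0, 1]      # alpha^2
print(multiply_mod(u, v, f))   # coordinates of (1 + alpha) * alpha^2 = 5 + alpha^2
```

Every product lands back in the span of $1, \alpha, \alpha^2$, which is the generating half of the proof above in computational form.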

Theorem 11.1.8 Let K be a finite field extension of the field F. If $S = \{\alpha_1, \alpha_2, \ldots,
\alpha_n\} \subseteq K$ forms a basis of the vector space K over F, then $K = F(\alpha_1, \alpha_2, \ldots, \alpha_n)$.

Proof Clearly, $F(\alpha_1, \alpha_2, \ldots, \alpha_n) \subseteq K$, since $F(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is the smallest
field containing F and S. For the reverse inclusion, let $\alpha \in K$. As S forms
a basis of K, there exist elements $a_1, a_2, \ldots, a_n$ in F such that $\alpha = a_1\alpha_1 +
a_2\alpha_2 + \cdots + a_n\alpha_n \in F(\alpha_1, \alpha_2, \ldots, \alpha_n)$. Thus $K \subseteq F(\alpha_1, \alpha_2, \ldots, \alpha_n)$ and hence
$K = F(\alpha_1, \alpha_2, \ldots, \alpha_n)$. □

We now prove a relation between a finite field extension and an algebraic exten- 
sion. 

Theorem 11.1.9 Every finite field extension of F is an algebraic extension. 

Proof Let K be a finite field extension of the field F. Suppose [K : F] = n and $\alpha$ is
a non-zero element of K. Then the (n + 1) elements $1, \alpha, \alpha^2, \ldots, \alpha^n$ of the vector
space K over F are linearly dependent. Hence there exist elements $a_0, a_1, \ldots, a_n$
(not all zero) in F such that $a_0 + a_1\alpha + \cdots + a_n\alpha^n = 0$. This shows that $\alpha$ is a root
of the non-null polynomial $a_0 + a_1 x + \cdots + a_n x^n \in F[x]$. This implies that $\alpha$, and hence
K, is algebraic over F. □
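
The proof is constructive in spirit: a linear dependence among $1, \alpha, \ldots, \alpha^n$ is exactly a polynomial satisfied by $\alpha$. The sketch below finds such a dependence by Gaussian elimination over Q. The helper names `dependence` and `mul_sqrt2`, and the example $\alpha = 1 + \sqrt{2}$ in $K = Q(\sqrt{2})$, are assumptions made for illustration only.

```python
from fractions import Fraction

def dependence(vectors):
    """Return coefficients c, not all zero, with sum(c[j] * vectors[j]) == 0.

    Assumes there are more vectors than coordinates, so a dependence exists.
    """
    rows, cols = len(vectors[0]), len(vectors)
    # Matrix whose j-th column is vectors[j].
    m = [[Fraction(vectors[j][i]) for j in range(cols)] for i in range(rows)]
    pivots = []                      # pivot column of each reduced row
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(cols) if c not in pivots)
    coeffs = [Fraction(0)] * cols
    coeffs[free] = Fraction(1)       # set one free variable to 1, the rest to 0
    for row, pc in enumerate(pivots):
        coeffs[pc] = -m[row][free]
    return coeffs

def mul_sqrt2(u, v):
    # (a + b*sqrt2)(c + d*sqrt2) with coordinates with respect to {1, sqrt2}
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

alpha = (Fraction(1), Fraction(1))          # alpha = 1 + sqrt2 in K = Q(sqrt2)
powers = [(Fraction(1), Fraction(0))]       # alpha^0
for _ in range(2):                          # n = [K : Q] = 2, so go up to alpha^2
    powers.append(mul_sqrt2(powers[-1], alpha))

coeffs = dependence(powers)                 # a_0, a_1, a_2 with a_0 + a_1*alpha + a_2*alpha^2 = 0
print(coeffs)
```

For this example the dependence found corresponds to the polynomial $x^2 - 2x - 1$, which indeed has $1 + \sqrt{2}$ as a root.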

Corollary If K is a finite field extension of F, then every element of K is algebraic 
over F . 

Remark The converse of Theorem 11.1.9 is not true in general. For example,
the field K defined in Example 11.1.7(ii) is an algebraic field extension of Q
but K is not a finite field extension of Q. This is so because for $\alpha \in K$, there
exist prime integers $p_1, p_2, \ldots, p_n$ such that $\alpha \in Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$. But
$Q(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$ is a finite extension of Q, which implies that $\alpha$ is algebraic over Q.
Thus K is algebraic over Q but [K : Q] is not finite. The partial converse of
Theorem 11.1.9 is true for the following particular case.

Theorem 11.1.10 The field extension $K(\alpha)$ over K is finite iff $\alpha$ is algebraic
over K.

Proof If $K(\alpha)$ is a finite field extension over K, then by the Corollary to
Theorem 11.1.9, it follows that $\alpha$ is algebraic over K. Conversely, let $\alpha$ be algebraic
over K. Then, as in Theorem 11.1.7, $[K(\alpha) : K] = n$, where n is
the degree of the minimal polynomial $f(x)$ of $\alpha$ over K. This proves that $K(\alpha)$ is a
finite extension over K. □

Corollary Every element of a simple algebraic extension F(a) is algebraic over F . 

Remark A transcendental element cannot appear in a simple algebraic extension. 

Definition 11.1.10 Let K be a field extension of the field F. A subfield L of K is
said to be an intermediate field of K/F iff $F \subseteq L \subseteq K$ holds and this intermediate
field L is said to be proper iff $L \neq K$ and $L \neq F$.

Example 11.1.8 Let K be a field extension of a field F and $\alpha$ be an element of K.
Then the field $F(\alpha)$ generated by F and $\alpha$ is an intermediate field, because
$F \subseteq F(\alpha) \subseteq K$.

Remark Let L be an intermediate field of K/F. Then 

(a) K is a vector space over F and L is a subspace of K ; 

(b) K/F is an algebraic extension iff K/L and L/F are both algebraic extensions. 

Theorem 11.1.11 Let K be a finite field extension of the field F and L be an
intermediate field of K/F. Then [K : F] = [K : L][L : F].

Proof Let V be a basis of the vector space K over L and U be a basis of the vector
space L over F. Suppose [L : F] = m and [K : L] = n. Let $V = \{v_1, v_2, \ldots, v_n\}$
and $U = \{u_1, u_2, \ldots, u_m\}$. Construct $W = \{uv : u \in U, v \in V\}$. Then W forms a
basis of K over F: every element of K is an L-linear combination of the $v_j$, each
coefficient is an F-linear combination of the $u_i$, and a relation $\sum a_{ij}u_iv_j = 0$ with
$a_{ij} \in F$ forces $\sum_i a_{ij}u_i = 0$ for each j and hence every $a_{ij} = 0$. Finally,
card W = nm proves the theorem. □

Remark If either card V or card U is infinite, then card W is also infinite. 

Corollary 1 If K is a field extension of the field F and each of the elements
$\alpha_1, \alpha_2, \ldots, \alpha_n$ of K is algebraic over F, then $F(\alpha_1, \alpha_2, \ldots, \alpha_n)$ is a finite
extension of F.

Proof The corollary follows by induction on n. □ 

Corollary 2 If $V = \{v_1, v_2, \ldots, v_n\}$ is a basis of a finite field extension K of L and
$U = \{u_1, u_2, \ldots, u_m\}$ is a basis of a finite field extension L of F, then $W = \{uv :
u \in U, v \in V\}$, consisting of mn elements, forms a basis of K over F.
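
To see the product basis at work, consider the tower $Q \subseteq Q(\sqrt{2}) \subseteq Q(\sqrt{2}, \sqrt{3})$ with $U = \{1, \sqrt{2}\}$ and $V = \{1, \sqrt{3}\}$. The sketch below (our own illustration, with hypothetical helper names) stores an element of the top field as a pair of elements of the middle field, reads off its four coordinates with respect to $W = \{1, \sqrt{2}, \sqrt{3}, \sqrt{2}\sqrt{3}\}$, and checks that $\sqrt{2} + \sqrt{3}$ satisfies a polynomial of degree 4 = [K : Q].

```python
from fractions import Fraction

# L = Q(sqrt2): a pair (a, b) means a + b*sqrt2, with basis U = {1, sqrt2} over Q.
def add_L(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul_L(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

# K = L(sqrt3): a pair (s, t) of elements of L means s + t*sqrt3,
# with basis V = {1, sqrt3} over L.
def mul_K(p, q):
    s, t = p
    u, v = q
    three = (Fraction(3), Fraction(0))
    return (add_L(mul_L(s, u), mul_L(three, mul_L(t, v))),
            add_L(mul_L(s, v), mul_L(t, u)))

def coords(p):
    """Coordinates over Q with respect to W = {1, sqrt2, sqrt3, sqrt2*sqrt3}."""
    (a, b), (c, d) = p
    return [a, b, c, d]

zero, one = Fraction(0), Fraction(1)
alpha = ((zero, one), (one, zero))      # sqrt2 + sqrt3
alpha2 = mul_K(alpha, alpha)
alpha4 = mul_K(alpha2, alpha2)

# Coordinates of alpha^4 - 10*alpha^2 + 1 with respect to W.
check = [x4 - 10 * x2 + c
         for x4, x2, c in zip(coords(alpha4), coords(alpha2), [1, 0, 0, 0])]
print(coords(alpha2), coords(alpha4), check)
```

All four coordinates of $\alpha^4 - 10\alpha^2 + 1$ vanish, so $\sqrt{2} + \sqrt{3}$ is a root of $x^4 - 10x^2 + 1$.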

Corollary 3 Let F be a field and K be a field extension of F such that [K : F] = n.
If an element $\alpha$ of K is algebraic over F, then the degree of $\alpha$ over F divides n.

Proof Consider the tower of fields: $F \subseteq F(\alpha) \subseteq K$. Then $n = [K : F] = [K :
F(\alpha)][F(\alpha) : F]$. As $\alpha$ is algebraic over F, $[F(\alpha) : F]$ is the degree of $\alpha$
over F. Hence the corollary follows. □

Corollary 4 Let F be a field and K be a field extension of F of prime degree p. If
$\alpha$ is an element of K but not in F, then $\alpha$ has the degree p over F and $K = F(\alpha)$.

Proof $[K : F] = p \Rightarrow [K : F(\alpha)][F(\alpha) : F] = p$. Again $\alpha \notin F \Rightarrow [F(\alpha) : F] \neq 1
\Rightarrow [K : F(\alpha)] = 1$ and $[F(\alpha) : F] = p$. The corollary follows from Ex. 1 of
Exercises-I. □

Corollary 5 Let $F \subseteq L \subseteq K$ be a tower of fields such that K is algebraic over L
and L is algebraic over F. Then K is algebraic over F.

Proof It is sufficient to show that every element $\alpha \in K$ is algebraic over F.
By hypothesis, K is algebraic over L $\Rightarrow$ there exist $a_0, a_1, \ldots, a_{n-1} \in L$
such that $\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_1\alpha + a_0 = 0 \Rightarrow \alpha$ is algebraic over the field
$F(a_0, a_1, \ldots, a_{n-1})$. Consider the tower of fields: $F \subseteq F(a_0) \subseteq F(a_0, a_1) \subseteq \cdots \subseteq
F(a_0, a_1, \ldots, a_{n-1}) \subseteq F(a_0, a_1, \ldots, a_{n-1}, \alpha)$. Each extension of the above
tower is finite, since each $a_i$ is algebraic over F. Hence by Theorem 11.1.11, the degree of
$F(a_0, a_1, \ldots, a_{n-1}, \alpha)$ over F is finite. Then it follows that $\alpha$ is algebraic over F.
Hence K is algebraic over F. □


The structure of a simple transcendental extension is given in the next theorem. 
Theorem 11.1.12 If $\alpha$ is transcendental over a field F, then the field $F(\alpha)$
generated by F and $\alpha$ is isomorphic to the field F(x) of all rational functions in x
over F.

Proof The extension $F(\alpha)$ is given by $F(\alpha) = \{f(\alpha)/g(\alpha) : f(x), g(x) \in F[x]$
and $g(\alpha) \neq 0\}$, using Corollary to Theorem 11.1.1. If two polynomial expressions
$f_1(\alpha)$ and $f_2(\alpha)$ are equal in $F(\alpha)$, their coefficients must be equal term by term;
otherwise, the difference $f_1(\alpha) - f_2(\alpha)$ will give a polynomial equation in $\alpha$ with
coefficients, not all zero. This contradicts our assumption that $\alpha$ is transcendental
over F. This shows that the map $\psi : F[\alpha] \to F[x]$, $f(\alpha) \mapsto f(x)$ is a bijection
and hence it is an isomorphism under usual operations of polynomials. This isomorphism
gives rise to an isomorphism $F(\alpha) \to F(x)$, $f(\alpha)/g(\alpha) \mapsto f(x)/g(x)$. □


11.1.1 Exercises 

Exercises-I 

1. Let K be a field extension of the field F. Prove that [K : F] = 1 iff K = F.

[Hint. $[K : F] = 1 \Rightarrow$ K is a vector space over F of dimension 1 with {1}
as a basis. Now $x \in K \Rightarrow x = a \cdot 1$ for some $a \in F \Rightarrow K \subseteq F$. Then $F \subseteq K$
and $K \subseteq F \Rightarrow K = F$. Conversely, if K = F, then {1} is a basis of the vector
space K over F. Hence [K : F] = 1.]

2. Let K be a field extension of the field F such that [K : F] is finite. If $f(x)$ is an
irreducible polynomial in F[x] such that $f(\alpha) = 0$ for some $\alpha \in K$, then prove
that $\deg f(x)$ divides [K : F].

[Hint. Suppose $\deg f(x) = n$. Then $[F(\alpha) : F] = n \Rightarrow [K : F] = [K :
F(\alpha)][F(\alpha) : F] = [K : F(\alpha)] \cdot n$.]

3. Find $[Q(\sqrt[3]{5}) : Q]$.

[Hint. $\sqrt[3]{5}$ has the minimal polynomial $x^3 - 5$ over Q. Hence
$[Q(\sqrt[3]{5}) : Q] = 3$.]

4. Let K be a field extension of the field F and $a, b \in K$ be algebraic over F.
If a has degree m over F and $b \neq 0$ has degree n over F, then show that the
elements $a + b$, $ab$, $a - b$ and $ab^{-1}$ are algebraic over F and each has at most
degree mn over F.

5. Let K be a field extension of F such that [K : F] = p, where p is a prime 
integer. Then show that K/F has no proper intermediate field. 

[Hint. Suppose K/F has an intermediate field L. Then $[K : F] = [K : L][L :
F] \Rightarrow [L : F]$ divides p $\Rightarrow$ either [L : F] = 1 or [L : F] = p $\Rightarrow$ either L = F
or L = K.]

6. Let K be a field extension of F and L be the set of all elements of K which are
algebraic over F. Then show that L is an intermediate field of K/F.

7. Let K be a field extension of F and L be an intermediate field of K/F. Show
that K/F is an algebraic extension iff both K/L and L/F are algebraic extensions.
8. Let K be a field extension of the field F and R be a ring such that $F \subseteq R \subseteq K$.
If every element of R is algebraic over F, then prove that R is a field.

[Hint. Let a be a non-zero element of R and $f(x) = x^n + \cdots + a_1 x + a_0$ be
its minimal polynomial over F. Then $a^n + \cdots + a_1 a + a_0 = 0$. If $a_0 = 0$, we
have a contradiction, since the minimal polynomial of a over F has degree n.
Again if $a_0 \neq 0$, then $a^{-1} = -a_0^{-1}(a^{n-1} + \cdots + a_1)$ exists in R.]

9. Show that $Q(\sqrt{5}, -\sqrt{5}) = Q(\sqrt{5})$.

[Hint. $Q(\sqrt{5}, -\sqrt{5})$ is the smallest field containing $\sqrt{5}$, $-\sqrt{5}$ and Q. Again
$\sqrt{5} \in Q(\sqrt{5}) \Rightarrow -\sqrt{5} \in Q(\sqrt{5})$. Moreover, $Q \subseteq Q(\sqrt{5})$, $Q(\sqrt{5}, -\sqrt{5}) \subseteq
Q(\sqrt{5})$ and $\sqrt{5} \in Q(\sqrt{5}, -\sqrt{5}) \Rightarrow Q(\sqrt{5})$ is the smallest field containing
Q, $\sqrt{5}$ and $-\sqrt{5}$.]

10. Show that $x^2 - 5$ is irreducible over $Q(\sqrt{2})$.

[Hint. Suppose $x^2 - 5$ is not irreducible over $Q(\sqrt{2})$. Then $x^2 - 5 =
(x - a)(x - b)$ for some $a, b \in Q(\sqrt{2})$. Hence $a + b = 0$ and $ab = -5 \Rightarrow
\sqrt{5} \in Q(\sqrt{2})$. This is a contradiction.]

11. Let K be a finite extension of the field F. If $\alpha, \beta \in K$ are algebraic over F, then
show that $\alpha + \beta$, $\alpha\beta$ and $\alpha^{-1}$ ($\alpha \neq 0$) are also algebraic over F.

[Hint. Clearly, $F(\alpha, \beta)$ is a finite extension of F, which shows that $F(\alpha, \beta)$ is an
algebraic extension of F.]

12. (a) If $\alpha$ is transcendental over a field F, then show that

(i) the map $\mu : F[x] \to F[\alpha]$, $f(x) \mapsto f(\alpha)$ is an isomorphism;

(ii) $F(\alpha)$ is isomorphic to the field F(x) of rational functions over F in the
indeterminate x.

(b) Show that the field $Q(\pi)$ is isomorphic to the field Q(x) of rational functions
over Q.

[Hint. See Theorem 11.1.12.]

13. For the field extension R of Q, show that $\pi$ is transcendental over $Q(\sqrt{2})$.

14. Let K be a field extension of F and L be also a field extension of F. If $\alpha \in K$
and $\beta \in L$ are both algebraic over F, then show that there is an isomorphism
of fields $\mu : F(\alpha) \to F(\beta)$, which is the identity on the subfield F and which
maps $\alpha \mapsto \beta$ iff the monic irreducible polynomials for $\alpha$ and $\beta$ over F are the
same.

15. If $f(x)$ is an irreducible polynomial of degree n over a field F, then show
that there is an extension K of F such that [K : F] = n and $f(x)$ has a root
in K.

16. Let K be a field extension of the field F and $\alpha$, $\beta$ be two algebraic elements
of K over F having degrees m and n respectively. If gcd(m, n) = 1, show that
$[F(\alpha, \beta) : F] = mn$.

17. Show that every irreducible polynomial in R[x] has degree 1 or 2.

[Hint. Let $f(x)$ be an irreducible polynomial in R[x]. Then $f(x)$ has a
root $\alpha$ in C by the Fundamental Theorem of Algebra. Since [C : R] = 2, the
degree of $\alpha$ over R divides 2.]
11.2 Finite Fields

Given a prime integer p and a positive integer n, we have shown in Chap. 6 the
existence of a finite field with $p^n$ elements.

Finite fields form an important class of fields. A field having finitely many
elements is called a finite field. If F is a finite field, the kernel of the homomorphism
$\psi : Z \to F$, $n \mapsto n1_F$, is a prime ideal. Then $\ker\psi \neq \{0\}$, because F is finite but Z
is infinite. Thus $\ker\psi = \langle p \rangle$, generated by a prime integer p, and $\psi(Z)$ is isomorphic
to the quotient field $Z/\langle p \rangle = Z_p$. So F contains a subfield K isomorphic to the
prime field $Z_p$. Hence F can be considered as an extension of $Z_p$.

In this section we study finite fields. Throughout the section p denotes a prime 
integer. 

Definition 11.2.1 A field having finitely many elements is called a finite field or 
Galois field. 

Let K be a finite field. If char K = p, then by Example 11.1.1(iv), K may be
considered as a finite dimensional vector space over $Z_p$. Then the dimension of K
over $Z_p$ is finite and it is denoted by $[K : Z_p]$.

Theorem 11.2.1 Let K be a finite field of characteristic p and $[K : Z_p] = n$. Then
K contains exactly $p^n$ elements.

Proof Let $B = \{x_1, x_2, \ldots, x_n\}$ be a basis of K over $Z_p$ and $x \in K$. Then x can be
expressed as $x = a_1x_1 + a_2x_2 + \cdots + a_nx_n$, where $a_i \in Z_p$. As $Z_p$ has p elements,
K has at most $p^n$ elements. Since B is a basis of K, the elements $a_1x_1 + a_2x_2 +
\cdots + a_nx_n$ are all distinct for every distinct choice of elements $a_1, a_2, \ldots, a_n$ of $Z_p$.
Thus K has exactly $p^n$ elements. □

Theorem 11.2.2 Let K be a finite field of characteristic p and $n = [K : Z_p]$. Then every
element of K is a root of the polynomial $x^{p^n} - x$ over $Z_p$.

Proof Let $[K : Z_p] = n$. Then K is a finite field of $p^n$ elements. Hence the
multiplicative group $K \setminus \{0\}$ is of order $p^n - 1$. If y is a non-zero element of K, then
$y^{p^n - 1} = 1$ and hence $y^{p^n} = y$. Moreover, $0^{p^n} = 0$. Thus every element of K is a
root of the polynomial $x^{p^n} - x$ over $Z_p$. □
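
Theorems 11.2.1 and 11.2.2 can be checked by brute force in a small case. The sketch below is our own illustration: it models a field with $3^2 = 9$ elements as $Z_3[t]/\langle t^2 + 1 \rangle$ (the polynomial $t^2 + 1$ has no root in $Z_3$, so it is irreducible there), lists the coordinate vectors with respect to the basis $\{1, t\}$, and tests $y^9 = y$ for each of them. The helper names are hypothetical.

```python
from itertools import product

p, n = 3, 2   # K = Z_3[t]/(t^2 + 1), an illustrative field with p^n = 9 elements

def mul(u, v):
    # (a + b*t)(c + d*t) with t^2 = -1, coefficients reduced mod p
    a, b = u
    c, d = v
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

# Every choice of coordinates (a, b) in Z_p x Z_p gives a distinct element.
elements = list(product(range(p), repeat=n))

# Each element is a root of x^(p^n) - x.
all_roots = all(power(y, p ** n) == y for y in elements)
print(len(elements), all_roots)

# The exponent p alone is not enough: t^3 = -t, which differs from t.
print(power((0, 1), p))
```

The last line shows why the exponent must be $p^n$ rather than p once n > 1.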

To describe finite fields, we need the concept of splitting fields. 

Definition 11.2.2 Let F be a field. A polynomial $f(x)$ in F[x] is said to split over a
field K containing F iff $f(x)$ can be factored as a product of linear factors in K[x].
A field K containing F is said to be a splitting field for $f(x)$ over F iff $f(x)$ splits
over K but does not split over any proper intermediate field L of K/F (i.e., if L is a
subfield of K such that $F \subseteq L \subseteq K$ and $f(x)$ splits over L, then L = K).
Remark 1 Let F be a field and $f(x)$ be a polynomial in F[x] of positive degree n.
Then an extension K of F is a splitting field of $f(x)$ iff

(i) $f(x)$ factors into linear factors in K[x] as $f(x) = a(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$,
with $\alpha_i \in K$ and $a \in F$;

(ii) K is generated by F and the roots of $f(x)$, i.e., $K = F(\alpha_1, \alpha_2, \ldots, \alpha_n)$.

The second condition shows that K is the smallest extension of F which contains
all the roots of $f(x)$.

Remark 2 If F is a field, then every polynomial $f(x)$ in F[x] of positive degree
has a splitting field over F.

Remark 3 Let F be a field and $f(x) \in F[x]$ be of positive degree. Then any two
splitting fields of $f(x)$ are isomorphic and hence the splitting field of $f(x)$ is unique
up to isomorphism.

Example 11.2.1 (i) C is the splitting field of the polynomial $x^2 + 1$ over R but R is
not a splitting field of $x^2 + 1$ over Q.

(ii) Let K be a finite field of characteristic p with $p^n$ elements. Then K is the
splitting field of $x^{p^n} - x$ over $Z_p$. This is so because, by Theorem 11.2.2, each of
the $p^n$ elements of K is a root of $x^{p^n} - x$, and this polynomial has at most $p^n$ roots
in any field. Hence K consists precisely of the roots of $x^{p^n} - x$, so the polynomial
splits over K and K is generated over $Z_p$ by its roots.


11.2.1 Exercises 

Exercises-II 

1. Show that any two finite fields with the same number of elements are isomorphic.

[Hint. Let K and S be two finite fields containing $p^n$ elements, where p is
prime and n is a positive integer. Then K and S are both splitting fields of
the same polynomial $x^{p^n} - x$ over $Z_p$ and hence they are isomorphic.]

2. Corresponding to a given prime integer p, and a positive integer n, show that
there exists a finite field consisting of the $p^n = q$ roots of $x^q - x$ over $Z_p$, which is
determined uniquely up to an isomorphism and is denoted by $F_{p^n}$. This field is
sometimes called the Galois field $GF(p^n)$.

3. Prove that $F_{p^n}$ is a subfield of $F_{p^m}$ iff n is a divisor of m.

[Hint. Let H be a subfield of $F_{p^m}$. Then H is a finite extension of $Z_p$
and $H = F_{p^n}$, where $n = [H : Z_p]$. Thus $m = [F_{p^m} : Z_p] = [F_{p^m} : H][H :
Z_p]$. Conversely, if n is a divisor of m, then $(p^n - 1) \mid (p^m - 1)$ and hence
$(x^{p^n} - x) \mid (x^{p^m} - x)$. Consequently, all the roots of $x^{p^n} - x$ over $Z_p$ are
contained among the roots of $x^{p^m} - x$ over $Z_p$.]
4. Let F be a field and G be a finite multiplicative subgroup of $F^* = F \setminus \{0\}$. Then
show that G is cyclic and G consists of all the nth roots of unity in F, where
|G| = n. (A generator of G is called a primitive element for F.)

[Hint. Use the Structure Theorem for abelian groups or see Chap. 10.]

5. Let F be a finite field. For the finite extension F(a, b) of F with a, b algebraic
over F, show that there exists an element c in F(a, b) such that F(a, b) = F(c),
i.e., F(a, b) is a simple extension of F.

6. Let F be a field of $p^n$ elements, where p is prime and n is a positive integer.
Then show that $[F : Z_p] = n$.

[Hint. Suppose $[F : Z_p] = m$. Then by Theorem 11.2.1, F has exactly $p^m$
elements. Hence $p^m = p^n \Rightarrow m = n$.]

7. Let F be a finite field of characteristic p. Then prove that

(a) if c is a primitive element for F, then $c^p$ is also;

(b) if $n = [F : Z_p]$, then there exists an element c in F such that c is algebraic
over $Z_p$, of degree n and $F = Z_p(c)$.

8. Every finite field F of characteristic p has an automorphism $\psi : F \to F$,
$x \mapsto x^p$. Find the order of $\psi$ in the automorphism group of F. If F is not
finite, examine the validity of the above result.

[Hint. For the field $F = Z_p(x)$, the field of rational functions, the map $\psi :
F \to F$, $x \mapsto x^p$ is not an automorphism.]

9. Let F be a finite field and K a finite field extension of F. Then prove that the
number of elements of K is some power of the number of elements of F.

[Hint. If char F = p > 0, then F has $p^m = q$ (say) elements. Again if
[K : F] = n, then K has $q^n$ elements.]
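
The number-theoretic fact behind Exercise 3, that $p^n - 1$ divides $p^m - 1$ exactly when n divides m, is easy to test on small cases. The following check is a sketch for experimentation, with ranges chosen arbitrarily by us; it does not replace the proof asked for in the exercise.

```python
def divides(a, b):
    """Return True iff a divides b (a is assumed non-zero)."""
    return b % a == 0

# Compare the two divisibility conditions for a few primes and small exponents.
checks = [
    divides(p ** n - 1, p ** m - 1) == divides(n, m)
    for p in (2, 3, 5, 7)
    for n in range(1, 9)
    for m in range(1, 9)
]
print(all(checks))
```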


11.3 Algebraic Numbers 

An algebraic number is a complex number which is algebraic over the field Q of
rational numbers. The theory of algebraic numbers was born and developed through the
attempts to prove Fermat's Last Theorem. In this section we study the basic properties of
algebraic numbers: for example, the set A of algebraic numbers is countable and forms
a field, called the algebraic number field, which is algebraically closed. We prove
the fundamental theorem of algebra and show the existence of real transcendental
numbers.

Definition 11.3.1 A complex number a which is algebraic over Q (i.e., a is a root 
of some non-null polynomial over Q) is called an algebraic number. Otherwise, it is 
said to be transcendental. 

Example 11.3.1 (i) Every rational number q is an algebraic number. This is so
because q satisfies the equation $x - q = 0$ over Q.

(ii) $1/\sqrt{5}$ is an algebraic number. This is so because it satisfies the equation
$x^2 - 1/5 = 0$ over Q.

(iii) e and $\pi$ are transcendental numbers [see Hardy and Wright (2008, pp. 218-227)].

Theorem 11.3.1 An algebraic number $\alpha$ satisfies a unique monic irreducible
polynomial equation $f(x) = 0$ over Q. Moreover, if $g(x) = 0$ is any polynomial equation
over Q satisfied by $\alpha$, then $g(x)$ is divisible by $f(x)$ in Q[x].

Proof Let S be the set of all polynomial equations over Q satisfied by the given
algebraic number $\alpha$ and $h(x) = 0$ be one of the lowest degree polynomial equations
in S. If the leading coefficient of $h(x)$ is q, define $f(x)$ by $f(x) = q^{-1}h(x)$. Hence
$f(\alpha) = 0$ and $f(x)$ is monic. We claim that $f(x)$ is irreducible in Q[x]. Suppose
$f(x) = f_1(x)f_2(x)$ in Q[x], where both factors have positive degree. Then at least one
of $f_1(\alpha) = 0$ and $f_2(\alpha) = 0$ holds.
This implies a contradiction, because $f(x) = 0$ is a polynomial equation of lowest
degree in S having $\alpha$ as a root. Again, since Q is a field, Q[x] is a Euclidean domain.
Hence for the polynomials $f(x)$ and $g(x)$, there exist polynomials $q(x)$ and $r(x)$
in Q[x] such that $g(x) = f(x)q(x) + r(x)$, where $r(x) = 0$ or $\deg r(x) < \deg f(x)$.
Then $r(x)$ must be identically zero; otherwise, the degree of $r(x)$ would be less
than the degree of $f(x)$ and $\alpha$ would be a root of the polynomial equation $r(x) = 0$.
Consequently, $f(x)$ divides $g(x)$ in Q[x]. To prove the uniqueness of $f(x)$, suppose
$m(x)$ is an irreducible monic polynomial over Q such that $m(\alpha) = 0$. Then $f(x)$
divides $m(x)$. Hence there exists some polynomial $n(x)$ in Q[x] such that $m(x) =
f(x)n(x)$. Since $m(x)$ is irreducible in Q[x], $n(x)$ must be constant in Q[x]. Again,
since $f(x)$ and $m(x)$ are both monic polynomials, it follows that $n(x) = 1$. This
proves the uniqueness of $f(x)$. □
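
The division step $g(x) = f(x)q(x) + r(x)$ used in the proof can be carried out exactly over Q. Below is a minimal sketch of the division algorithm, with polynomials stored as coefficient lists (lowest degree first); the function name `poly_divmod` and the sample polynomials are our own choices. With $\alpha = \sqrt{3}$ and $f(x) = x^2 - 3$, a polynomial having $\alpha$ as a root leaves remainder zero, while one that does not vanish at $\alpha$ leaves a non-zero remainder.

```python
from fractions import Fraction

def poly_divmod(g, f):
    """Divide g by f in Q[x]; return (q, r) with g = f*q + r and deg r < deg f.

    Polynomials are coefficient lists, lowest degree first; f must have a
    non-zero leading coefficient. The zero remainder is returned as [].
    """
    f = [Fraction(c) for c in f]
    r = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(r) - len(f) + 1, 1)
    while len(r) >= len(f) and any(r):
        shift = len(r) - len(f)
        coeff = r[-1] / f[-1]
        q[shift] = coeff
        for i, c in enumerate(f):
            r[shift + i] -= coeff * c
        r.pop()                      # the leading term has been cancelled
    while r and r[-1] == 0:
        r.pop()
    return q, r

f = [-3, 0, 1]                              # x^2 - 3
print(poly_divmod([-9, 0, 0, 0, 1], f))     # x^4 - 9 vanishes at sqrt(3)
print(poly_divmod([-2, 0, 0, 1], f))        # x^3 - 2 does not
```

In the first case the quotient is $x^2 + 3$ with zero remainder; in the second the remainder is $3x - 2$.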

This theorem leads to the following definition. 

Definition 11.3.2 Let a be an algebraic number. Then the irreducible monic poly- 
nomial f(x) over Q having a as a root is called the minimal polynomial of a and 
the degree of f(x) is called the degree of a. 

Example 11.3.2 $\sqrt{3}$ is an algebraic number and $f(x) = x^2 - 3$ is the minimal
polynomial of $\sqrt{3}$ over Q. The field extension $Q(\sqrt{3})$ of Q is of degree 2 with a
basis $B = \{1, \sqrt{3}\}$.

Theorem 11.3.2 Let F be a finite field extension of Q of degree n. Then every
element $\alpha$ of F is an algebraic number and is a root of an irreducible polynomial
over Q, of degree $\leq n$.

Proof Clearly, the set $S = \{1, \alpha, \alpha^2, \ldots, \alpha^n\}$, consisting of n + 1 elements of
the n-dimensional vector space F over Q, is linearly dependent over Q. This implies
that $\alpha$ is a root of a non-null polynomial over Q of degree $\leq n$, and hence $\alpha$ is an
algebraic number. Its minimal polynomial divides this polynomial by Theorem 11.3.1
and is therefore irreducible of degree $\leq n$. □

Corollary Every element of a simple algebraic extension Q(a) is an algebraic 
number. 
Definition 11.3.3 A field F is said to be algebraically closed (or complete) iff every
polynomial in F[x], with degree $\geq 1$, has a root in F.

Two important algebraically closed fields, namely the field C of complex numbers
and the field of algebraic numbers, are discussed here.

First we show that C is algebraically closed by the Fundamental Theorem of Al- 
gebra. There are several proofs of the Fundamental Theorem of Algebra. We present 
one of them by using the concept of homotopy as discussed in Sect. 2.10 of Chap. 2. 

Theorem 11.3.3 (Fundamental Theorem of Algebra) Every non-constant polyno- 
mial with complex coefficients has a complex root. 

Proof To prove the theorem it is sufficient to prove that a non-constant polynomial

$f(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0, \quad a_i \in C$ (11.1)

has a root in C. If $a_0 = 0$, then 0 is a root of $f(z)$. So we assume that $a_0 \neq 0$.
We may consider f as a continuous function $f : C \to C$, defined by (11.1). Suppose
$f(z)$ has no root in C. Then $f(z)$ is never zero. Consider the unit circle
$S^1 = \{z \in C : |z| = 1\}$ in the complex plane. Define a continuous family of maps
$f_t : S^1 \to S^1$, $z \mapsto f(tz)/|f(tz)|$ for non-negative real numbers t. Then any two
maps of this family are homotopic. For t = 0, $f_0$ is a constant map. On the other
hand, for sufficiently large t, $f_t$ is homotopic to g, where $g(z) = z^n$, because $z^n$ is
dominant in the expression of $f(tz)$ for sufficiently large t. But $f_0$ cannot be
homotopic to g, because their degrees (0 and $n \geq 1$) are different. This contradiction
shows that $f(z)$ has a root in C. □

Corollary 1 The field C of complex numbers is algebraically closed. 

Remark This corollary proves the algebraic completeness of the field of complex 
numbers. 

Corollary 2 The field R of real numbers is embedded in the algebraically closed 
field C of complex numbers. 

We now show that the set of algebraic numbers forms an algebraically closed 
field. 

Theorem 11.3.4 (a) The set A of algebraic numbers is a field, called the algebraic
number field.

(b) The field A is algebraically closed.

Proof (a) It is sufficient to show that the sum, product, difference and quotient of
any two elements $\alpha$ and $\beta \neq 0$ in A are also in A. Clearly, $\alpha + \beta$, $\alpha - \beta$, $\alpha\beta$ and
$\alpha\beta^{-1}$ are in the subfield $Q(\alpha, \beta)$ of C, which is generated by Q and the two elements
$\alpha$ and $\beta$ of A. Again $\alpha$ is algebraic over Q $\Rightarrow Q(\alpha)$ is a finite extension of Q
and similarly $\beta$ is algebraic over $Q(\alpha) \Rightarrow Q(\alpha, \beta)$ is a finite extension over $Q(\alpha)$.
Hence $Q(\alpha, \beta)$ is a finite extension of Q $\Rightarrow$ each element of $Q(\alpha, \beta)$ is an algebraic
number. This shows that A is a field.

(b) Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$ be a polynomial in A[x] of degree $n \geq 1$. Then the
coefficients $a_0, a_1, \ldots, a_{n-1}$ generate an extension $F = Q(a_0, a_1, \ldots, a_{n-1})$. Clearly,
F is a finite extension of the field Q. By Theorem 11.3.3, $f(x)$ has a complex root.
Any complex root $\alpha$ of $f(x)$ is algebraic over
the field F and hence $F(\alpha)$ is a finite extension of F. Consequently, $F(\alpha)$ is also a
finite extension of Q. Thus the element $\alpha$ of this extension is an algebraic number,
i.e., it lies in the field A. This leads us to conclude that the field A is algebraically closed. □

Corollary The field Q of rational numbers is embedded in the algebraically
closed field A.

We now show the existence of transcendental numbers. It does not follow im- 
mediately that there are transcendental numbers, though almost all real numbers are 
transcendental. So we need an alternative definition of an algebraic number. 

An algebraic number is a complex number $\alpha$ which satisfies an algebraic equation
of the form $a_0x^n + a_1x^{n-1} + \cdots + a_n = 0$, where $a_0, a_1, \ldots, a_n$ are all integers,
not all zero. A number which is not an algebraic number is called a transcendental
number.

Theorem 11.3.5 The set of all algebraic numbers is countable. 

Proof To prove this theorem we define the rank r of an equation $a_0x^n + a_1x^{n-1} +
\cdots + a_n = 0$, $a_i \in Z$ and $a_0 \neq 0$, as $r = n + |a_0| + |a_1| + \cdots + |a_n|$. The minimum
value of the rank r is 2. As there exist only a finite number of such equations of rank r,
we may write them as:

$A_{r,1}, A_{r,2}, \ldots, A_{r,m_r}$.

Arranging the equations in the sequence:

$A_{2,1}, A_{2,2}, \ldots, A_{2,m_2}, A_{3,1}, A_{3,2}, \ldots, A_{3,m_3}, A_{4,1}, \ldots$

we can assign to them the integers 1, 2, 3, .... This shows that the aggregate of these
equations is countable. Clearly, every algebraic number corresponds to at least one
of these equations, and the number of algebraic numbers corresponding to any
one of these equations is finite. Hence the theorem follows. □
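
The enumeration by rank is concrete enough to run. The sketch below lists, for a given rank r, all coefficient tuples $(a_0, a_1, \ldots, a_n)$ with $n \geq 1$, $a_0 \neq 0$ and $n + |a_0| + \cdots + |a_n| = r$; the function name is hypothetical, and the restriction $n \geq 1$ is our reading of the statement that the minimum rank is 2.

```python
from itertools import product

def equations_of_rank(r):
    """All integer tuples (a_0, ..., a_n), n >= 1, a_0 != 0, of rank r."""
    eqs = []
    for n in range(1, r):            # |a_0| >= 1 forces n <= r - 1
        budget = r - n               # required value of |a_0| + ... + |a_n|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(c) for c in coeffs) == budget:
                eqs.append(coeffs)
    return eqs

# Each rank contributes finitely many equations, so listing rank 2, then
# rank 3, and so on, numbers all the equations 1, 2, 3, ...
for r in range(2, 5):
    print(r, len(equations_of_rank(r)))
```

Rank 2 contains only the equations $x = 0$ and $-x = 0$, and each later rank is again a finite list, which is the finiteness the proof relies on.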

Corollary The set of all real algebraic numbers is countable and has measure zero. 

Existence of Transcendental Numbers The set of all real numbers is not count- 
able. On the other hand, the set of all real algebraic numbers is countable. Hence 
there exist real numbers which are not algebraic. This shows the existence of real 
transcendental numbers. This leads us to conclude the following. 


Theorem 11.3.6 Almost all real numbers are transcendental. 
11.4 Exercises

Exercises-III 

1. Show that the degree of an algebraic number $\alpha$ is equal to the dimension $n =
[Q(\alpha) : Q]$, with a basis $B = \{1, \alpha, \ldots, \alpha^{n-1}\}$. (Here $Q(\alpha)$ is regarded as a vector
space over Q.)

2. If two algebraic numbers $\alpha$ and $\beta$ generate the same extension fields $Q(\alpha)$ and
$Q(\beta)$, i.e., $Q(\alpha) = Q(\beta)$, then show that $\alpha$ and $\beta$ have the same degree.

3. Let F be a field. Show that the following statements are equivalent:

(a) F is algebraically closed;

(b) every irreducible polynomial in F[x] is of degree 1;

(c) if a polynomial $f(x) \in F[x]$ is of degree $\geq 1$, then $f(x)$ can be expressed as
a product of linear factors in F[x];

(d) if K is an algebraic field extension of F, then F = K.

4. Let F be a finite field. Then show that F cannot be algebraically closed. 

We now introduce the concept of algebraic integers. 


11.5 Algebraic Integers 

The concept of algebraic integers is one of the most important discoveries in number 
theory. The ring of algebraic integers shares some properties of the ring of integers 
but it differs as regards many other properties. 

Definition 11.5.1 A complex number $\alpha$ is called an algebraic integer iff $\alpha$ is a root 
of a monic irreducible polynomial $f(x)$ in $\mathbb{Q}[x]$ with integral coefficients, of the 
form 

$$f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0.$$

Remark 1 An algebraic number $\alpha$ is an algebraic integer iff $\alpha$ satisfies some monic 
polynomial equation $f(x) = 0$ with integral coefficients. 

Remark 2 While defining algebraic integers, some authors do not insert the condition 
of irreducibility of $f(x)$ in $\mathbb{Q}[x]$, because of Theorem 11.5.2. The irreducible 
equation satisfied by a rational number $p/q$ is just the linear equation $x - p/q = 0$. 
Thus a rational number is an algebraic integer iff it is an integer in the ordinary 
sense. Such an integer in $\mathbb{Z}$ is called a rational integer to distinguish it from other 
algebraic integers. 

Theorem 11.5.1 Every algebraic number is of the form $\alpha/\beta$, where $\alpha$ is an 
algebraic integer and $\beta$ is a non-zero integer in $\mathbb{Z}$. 

Proof Let $\gamma$ be an algebraic number. Then $\gamma$ is a root of a monic irreducible 
polynomial $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 \in \mathbb{Q}[x]$. If $\beta$ is the lcm of the 
denominators of $a_0, a_1, \ldots, a_{n-1}$, then $\beta$ is a positive integer and each $\beta a_i$ is an 
integer, for $i = 0, 1, \ldots, n-1$. Multiplying $f(\gamma) = 0$ by $\beta^n$ gives 
$(\beta\gamma)^n + \beta a_{n-1}(\beta\gamma)^{n-1} + \cdots + \beta^{n-1}a_1(\beta\gamma) + \beta^n a_0 = 0$, 
so $\beta\gamma$ is a root of a monic polynomial in $\mathbb{Z}[x]$. 
Hence $\beta\gamma$ is an algebraic integer, $\alpha$ (say). Thus $\gamma = \alpha/\beta$, where $\alpha$ is an algebraic 
integer and $\beta$ is a non-zero integer. □ 
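
The scaling step in this proof can be illustrated with a short Python sketch. The function name `clear_denominators` and the convention of listing the coefficients as $[a_0, \ldots, a_{n-1}]$ are assumptions made only for this illustration; they do not come from the text.

```python
from fractions import Fraction
from math import lcm  # available from Python 3.9


def clear_denominators(coeffs):
    """coeffs = [a_0, ..., a_{n-1}]: rational coefficients of the monic
    polynomial x^n + a_{n-1} x^{n-1} + ... + a_0 satisfied by gamma.

    Returns (beta, [b_0, ..., b_{n-1}]) where beta is the lcm of the
    denominators and y^n + b_{n-1} y^{n-1} + ... + b_0 is the monic integer
    polynomial satisfied by y = beta * gamma, i.e. b_i = beta^(n-i) * a_i.
    """
    coeffs = [Fraction(c) for c in coeffs]
    n = len(coeffs)
    beta = lcm(*(c.denominator for c in coeffs))
    scaled = [c * beta ** (n - i) for i, c in enumerate(coeffs)]
    # Every scaled coefficient is an integer because beta * a_i is an integer.
    assert all(s.denominator == 1 for s in scaled)
    return beta, [int(s) for s in scaled]


# gamma = 1/sqrt(2) is a root of x^2 - 1/2; then 2*gamma is a root of y^2 - 2.
print(clear_denominators([Fraction(-1, 2), Fraction(0)]))
```

Here $\gamma = 1/\sqrt{2}$ gives $\beta = 2$ and $\beta\gamma = \sqrt{2}$, which is indeed a root of $y^2 - 2$.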

Definition 11.5.2 An algebraic element $\alpha \neq 0$ is called a unit iff both $\alpha$ and $\alpha^{-1}$ 
are algebraic integers. 

Example 11.5.1 (i) $\sqrt{3}$ is an algebraic integer, because it satisfies the equation 
$x^2 - 3 = 0$ over $\mathbb{Z}$. 

(ii) $(1 + \sqrt{13})/2$ is an algebraic integer, because it satisfies the equation 
$x^2 - x - 3 = 0$ over $\mathbb{Z}$. 

(iii) $\omega = \frac{1}{2}(-1 + \sqrt{3}\,i)$ is an algebraic integer, because it satisfies the equation 
$x^2 + x + 1 = 0$ over $\mathbb{Z}$. In general, every root of unity is an algebraic integer, because 
it satisfies the monic polynomial equation $x^n - 1 = 0$ over $\mathbb{Z}$ for some positive 
integer $n$. 

(iv) $0, \pm 1, \pm 2, \ldots$ are the only algebraic integers in $\mathbb{Q}$ (referred to as “rational 
integers”). 

[Hint. Every integer $n$ is an algebraic integer, because $n$ is a root of $(x - n) \in 
\mathbb{Z}[x]$. On the other hand, suppose a rational number $m/q$ with $\gcd(m, q) = 1$ is an 
algebraic integer. Then there exists a polynomial $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0 \in 
\mathbb{Z}[x]$ such that $f(m/q) = 0$. Consequently, $m^n + a_{n-1}qm^{n-1} + \cdots + a_0q^n = 0 \Rightarrow 
q \mid m^n \Rightarrow q = \pm 1$, since $\gcd(m, q) = 1 \Rightarrow m/q$ is an integer.] 

(v) $1/\sqrt{2}$ is an algebraic number but it is not an algebraic integer. 

[Hint. Since $1/\sqrt{2}$ satisfies the equation $x^2 - 1/2 = 0$ over $\mathbb{Q}$, it is an algebraic 
number. If possible, let $1/\sqrt{2}$ be an algebraic integer. Then it satisfies a 
monic polynomial $f(x)$ over $\mathbb{Z}$ such that $f(1/\sqrt{2}) = (1/\sqrt{2})^n + a_{n-1}(1/\sqrt{2})^{n-1} + 
a_{n-2}(1/\sqrt{2})^{n-2} + \cdots + a_0 = 0$. Multiplying by $(\sqrt{2})^n$, we get 
$(1 + 2a_{n-2} + 4a_{n-4} + \cdots) + \sqrt{2}\,(a_{n-1} + 2a_{n-3} + \cdots) = 0$. 
If $a = 1 + 2a_{n-2} + 4a_{n-4} + \cdots$ and $b = a_{n-1} + 2a_{n-3} + \cdots$, 
then $a + b\sqrt{2} = 0$. Again if $b \neq 0$, then $\sqrt{2} = -a/b$ is the quotient of two integers. 
This gives a contradiction. Hence $b = 0$ and thus $a + b\sqrt{2} = 0$ gives $a = 0$. This 
also gives a contradiction, since $a$ is odd.] 

Remark In examining whether a given algebraic number is an algebraic integer, it 
is not necessary to appeal to an irreducible polynomial equation. This follows from 
Theorem 11.5.2. 

Theorem 11.5.2 A complex number $\alpha$ is an algebraic integer iff it satisfies over $\mathbb{Q}$ 
a monic polynomial equation with integral coefficients. 

Proof If $\alpha$ is an algebraic integer, then clearly $\alpha$ satisfies a monic polynomial $p(x)$ 
over $\mathbb{Q}$ with integral coefficients. Conversely, let a complex number $\alpha$ satisfy a 
monic polynomial equation $p(x) = 0$ over $\mathbb{Q}$ with integral coefficients. Then $\alpha$ also satisfies 
an irreducible polynomial, say $f(x)$, over $\mathbb{Q}$ with integral coefficients. Note that any 
common divisor of these integral coefficients may be removed. So without loss of 
generality, we may assume that the gcd of these integral coefficients is 1, so that 
$f(x)$ is a primitive polynomial in $\mathbb{Z}[x]$. The given polynomial $p(x)$ is monic 
and hence also primitive in $\mathbb{Z}[x]$. Now, the polynomial $p(x)$ is divisible in $\mathbb{Q}[x]$ by 
the irreducible polynomial $f(x)$. Let $p(x) = f(x)g(x)$, where $g(x) \in \mathbb{Q}[x]$. Since 
both $p(x)$ and $f(x)$ are primitive polynomials in $\mathbb{Z}[x]$, it follows that $g(x) \in \mathbb{Z}[x]$. 
Thus the leading coefficient 1 in $p(x)$ is the product of the leading coefficients in 
$f(x)$ and $g(x)$. Hence $\pm f(x)$ is monic, and so $\alpha$ is an algebraic integer by 
definition. □ 


11.6 Gaussian Integers 

We recall that $\mathbb{Z}[i] = \{a + bi : a, b \in \mathbb{Z}\}$ forms a ring, called the Gaussian ring, named 
after C.F. Gauss. The elements of $\mathbb{Z}[i]$, called Gaussian integers, are the points of 
a square lattice in the complex plane. In this section we define Gaussian integers 
and Gaussian prime integers. Moreover, we study their properties with an eye to the 
corresponding properties of integers. 

Definition 11.6.1 A complex number of the form $a + bi$, $a, b \in \mathbb{Z}$ (i.e., an element 
of $\mathbb{Z}[i]$), is called a Gaussian integer. 

The set of all Gaussian integers is a subring of $\mathbb{C}$. C.F. Gauss developed the 
properties of Gaussian integers in his work on biquadratic reciprocity. $\mathbb{Z}[i]$ is a 
Euclidean domain with norm function defined by $N(\alpha) = \alpha\bar{\alpha} = m^2 + n^2$, for 
$\alpha = m + ni \in \mathbb{Z}[i]$. Clearly, $N(\alpha)$ has the following properties in $\mathbb{Z}[i]$: 

(a) $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in \mathbb{Z}[i] \setminus \{0\}$; 

(b) $N(\alpha) = 1$ iff $\alpha$ is a unit; 

(c) $1, -1, i$ and $-i$ are the only units in $\mathbb{Z}[i]$; 

(d) $N(\alpha) = 1$ if $\alpha$ is a unit, and $N(\alpha) > 1$ if $\alpha \notin \{0, 1, -1, i, -i\}$. 
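
The norm properties listed above can be checked on examples with a few lines of Python. In this sketch a Gaussian integer $a + bi$ is represented by the pair `(a, b)`; this representation and the helper names `norm`, `mul`, `is_unit` and `associates` are assumptions made for illustration only.

```python
def norm(z):
    """N(a + bi) = a^2 + b^2."""
    a, b = z
    return a * a + b * b


def mul(z, w):
    """Product (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


# The four units 1, -1, i, -i of Z[i].
UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def is_unit(z):
    """Property (b): a Gaussian integer is a unit iff its norm is 1."""
    return norm(z) == 1


def associates(z):
    """The four associates u*z, where u runs over the units."""
    return [mul(u, z) for u in UNITS]


# Property (a) on one example: N((2+i)(3-2i)) = N(2+i) N(3-2i).
print(norm(mul((2, 1), (3, -2))), norm((2, 1)) * norm((3, -2)))
# Associates of 2 + i.
print(associates((2, 1)))
```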


Definition 11.6.2 A prime element in the Gaussian ring Z[i] is called a Gaussian 
prime integer. 


Remark Every Gaussian prime is non-zero and non-unit. 


Example 11.6.1 (i) 3 is a Gaussian prime integer. 

(ii) 5 is not a Gaussian prime integer. 

[Hint. As there are four units in $\mathbb{Z}[i]$, there are four factorizations of the integer 5 
in $\mathbb{Z}[i]$, which are considered equivalent: $5 = (2 + i)(2 - i)$. This is the prime 
factorization of 5 in $\mathbb{Z}[i]$. Every non-zero element $\alpha \in \mathbb{Z}[i]$ has four associates, viz., 
$\pm\alpha, \pm i\alpha$. For example, the associates of $2 + i$ are $2 + i$, $-2 - i$, $-1 + 2i$, $1 - 2i$.] 





Example 11.6.2 $3 + 2i$ is an algebraic integer, because it is a root of $x^2 - 6x + 
13 = 0$. 

Proposition 11.6.1 For an element $\alpha$ in $\mathbb{Z}[i]$, if $N(\alpha) = p$, a prime integer, then $\alpha$ 
is prime in $\mathbb{Z}[i]$. 

Proof Clearly, $\alpha$ is a non-zero non-unit, since $p \neq 0$, $p \neq 1$ and $1, -1, i$ and $-i$ are 
the only units in $\mathbb{Z}[i]$. Let $\alpha = \beta\gamma$ for some $\beta, \gamma \in \mathbb{Z}[i]$. Then $N(\alpha) = 
N(\beta)N(\gamma) \Rightarrow p = N(\beta)N(\gamma) \Rightarrow$ either $N(\beta) = 1$ or $N(\gamma) = 1$, since $p$ is prime 
and $N(\beta)$, $N(\gamma)$ are both positive integers. Hence either $\beta$ or $\gamma$ is a unit in $\mathbb{Z}[i]$. 
This shows that $\alpha$ is irreducible in $\mathbb{Z}[i]$. As $\mathbb{Z}[i]$ is a Euclidean domain, $\mathbb{Z}[i]$ is 
also a principal ideal domain. Hence $\alpha$ is a prime element in $\mathbb{Z}[i]$. □ 

Remark $N(3) = 9 \Rightarrow$ the norm of a Gaussian prime integer may not be a prime integer. 

Example 11.6.3 The Gaussian integer $1 + i$ is a Gaussian prime, since $N(1 + i) = 2$ 
is a prime integer. 

Remark Not all prime integers are Gaussian primes. Let $p$ be a prime integer. Then 
$p$ is a Gaussian prime iff the ring $\mathbb{Z}[i]/(p)$ is a field. 

We now describe all Gaussian primes. 

Theorem 11.6.1 The Gaussian primes are precisely of the following types: 

(a) all prime integers of the form $4n + 3$ and their associates in $\mathbb{Z}[i]$; 

(b) the number $1 + i$ and its associates in $\mathbb{Z}[i]$; 

(c) all Gaussian integers $\alpha$ associated with either $a + bi$ or $a - bi$, where $a > 0$, 
$b > 0$, one of $a$ or $b$ is even and $N(\alpha) = a^2 + b^2$ is a prime integer of the form 
$4m + 1$. 

Proof Let $\alpha = a + bi$ be a prime element in $\mathbb{Z}[i]$. Then $N(\alpha) = a^2 + b^2$ is an 
integer $> 1$. Suppose $N(\alpha) = \alpha\bar{\alpha} = p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$, where $p_1, p_2, \ldots, p_m$ are 
distinct prime integers. Then $\alpha \mid p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$ in $\mathbb{Z}[i]$. Since $\alpha$ is prime, $\alpha$ divides 
at least one of the integers $p_1, p_2, \ldots, p_m$. We claim that $\alpha$ divides only 
one of them. If possible, let $\alpha$ divide two distinct primes 
$p$ and $q$ in the above expression. Now $\gcd(p, q) = 1 \Rightarrow 1 = px + qy$ for some 
$x, y \in \mathbb{Z} \Rightarrow \alpha \mid 1 \Rightarrow$ a contradiction, since $\alpha$ is a prime element in $\mathbb{Z}[i]$. This shows 
that $\alpha$ divides only one prime $p$ (say) in $\mathbb{Z}[i]$. On dividing $p$ by 4, we have either 
$p \equiv 1 \pmod 4$ or $p \equiv 2 \pmod 4$ or $p \equiv 3 \pmod 4$. 

(a) Suppose $p \equiv 3 \pmod 4$. Now $\alpha \mid p$ in $\mathbb{Z}[i] \Rightarrow N(\alpha) \mid N(p) = p^2 \Rightarrow N(\alpha) = p$ 
or $p^2$, since $N(\alpha) > 1$. If $N(\alpha) = p$, then $a^2 + b^2 = p$. Since $p$ is an odd 
prime, either $a$ is even and $b$ is odd or $a$ is odd and $b$ is even, which implies 
$p \equiv 1 \pmod 4$, a contradiction. Thus $N(\alpha) = p^2$. Again $\alpha \mid p$ in $\mathbb{Z}[i] \Rightarrow 
\alpha\beta = p$ for some $\beta \in \mathbb{Z}[i] \Rightarrow N(\alpha)N(\beta) = N(p) \Rightarrow p^2N(\beta) = p^2 \Rightarrow N(\beta) = 
1 \Rightarrow \beta$ is a unit in $\mathbb{Z}[i] \Rightarrow \alpha$ and $p$ are associates in $\mathbb{Z}[i]$. Thus $p$ is also a prime 
element in $\mathbb{Z}[i]$, as required in (a). 

(b) Suppose $p \equiv 2 \pmod 4$. Then $p = 2 = (1 + i)(1 - i)$. Now $N(1 + i) = 2 \Rightarrow 1 + i$ 
is prime in $\mathbb{Z}[i]$. Again $1 - i = -i(1 + i)$ shows that $1 - i$ is an associate of 
$1 + i$. Thus $1 + i$ and its associates are prime elements in $\mathbb{Z}[i]$, as required 
in (b). 

(c) Suppose $p \equiv 1 \pmod 4$. Now $\alpha \mid p \Rightarrow N(\alpha) \mid N(p) \Rightarrow N(\alpha) = p$ or $p^2$. Suppose 
$N(\alpha) = p^2$. As $p \equiv 1 \pmod 4$, suppose $p = 4k + 1$, $k \in \mathbb{Z}$. Then the quadratic 
character (Legendre symbol) satisfies 
$\left(\frac{-1}{p}\right) \equiv (-1)^{(p-1)/2} \pmod p = (-1)^{2k} = 1$. Hence 
there exists an integer $n$ such that $n^2 \equiv -1 \pmod p \Rightarrow p \mid (n^2 + 1)$ in $\mathbb{Z} \Rightarrow 
p \mid (n^2 + 1)$ in $\mathbb{Z}[i] \Rightarrow p \mid (n + i)(n - i)$ in $\mathbb{Z}[i]$. But $\alpha \mid p \Rightarrow \alpha \mid (n + i)(n - i) \Rightarrow 
\alpha \mid (n + i)$ or $\alpha \mid (n - i)$, since $\alpha$ is a Gaussian prime. We claim that $p$ is not an 
associate of $\alpha$. Suppose $p$ is an associate of $\alpha$. Then $p \mid \alpha$ in $\mathbb{Z}[i] \Rightarrow p \mid (n + i)$ 
or $p \mid (n - i)$ in $\mathbb{Z}[i] \Rightarrow$ either $\frac{n}{p} + \frac{i}{p}$ or $\frac{n}{p} - \frac{i}{p}$ is a Gaussian integer $\Rightarrow$ 
a contradiction, since $1/p$ is not an integer $\Rightarrow p$ is not an associate of $\alpha$. 
Suppose $N(\alpha) = N(p) = p^2$. Then $\alpha \mid p \Rightarrow p = \alpha\beta$ for some $\beta \in \mathbb{Z}[i]$. Hence 
$N(p) = N(\alpha)N(\beta) \Rightarrow N(\beta) = 1 \Rightarrow \beta$ is a unit in $\mathbb{Z}[i] \Rightarrow p$ is an associate of 
$\alpha \Rightarrow$ a contradiction $\Rightarrow N(\alpha) \neq p^2 \Rightarrow N(\alpha) = p$ (which is the other 
possibility) $\Rightarrow N(\bar{\alpha}) = p \Rightarrow a - bi$ is also a Gaussian prime integer. Suppose $a - bi$ 
is an associate of $a + bi$. Then $a - bi = u(a + bi)$, where $u \in \{1, -1, i, -i\}$. 
If $u = 1$, then $b = 0$. Hence $a^2 = N(\alpha) = p$, which is not possible. Thus $u \neq 1$. 
Similarly, $u \neq -1, i, -i$. Consequently, $a + bi$ is not an associate of $a - bi$. Again 
$N(\alpha) = p \Rightarrow a^2 + b^2 = p \Rightarrow$ one of $a$ and $b$ must be even and the other must 
be odd. This proves (c). □ 
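
The classification of Theorem 11.6.1 can be turned into a simple primality test for a Gaussian integer $a + bi$. The following Python sketch is only an illustration: the function names are hypothetical, and trial division is used for rational primality, so it is suitable only for small inputs.

```python
def is_rational_prime(n):
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_gaussian_prime(a, b):
    """Decide whether a + bi is a Gaussian prime using Theorem 11.6.1."""
    if a == 0 or b == 0:
        # a + bi is an associate of the rational integer |a| or |b| (type (a)):
        # it must be a rational prime congruent to 3 modulo 4.
        p = abs(a + b)
        return is_rational_prime(p) and p % 4 == 3
    # Both coordinates are non-zero (types (b) and (c)): the norm must be a
    # rational prime, which is then either 2 or of the form 4m + 1.
    return is_rational_prime(a * a + b * b)


for z in [(3, 0), (5, 0), (1, 1), (2, 1), (0, -7), (1, 3)]:
    print(z, is_gaussian_prime(*z))
```

For instance, the test accepts $3$, $1 + i$ and $2 + i$, and rejects $5 = (2 + i)(2 - i)$.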

Like the division algorithm for ordinary integers and for polynomials, a division 
algorithm can be developed for Gaussian integers. 

Theorem 11.6.2 For given Gaussian integers $\alpha$ and $\beta \neq 0$, there exist Gaussian 
integers $\gamma$ and $\lambda$ in $\mathbb{Z}[i]$ such that 

$$\alpha = \beta\gamma + \lambda, \quad \text{where either } \lambda = 0 \text{ or } N(\lambda) < N(\beta). \qquad (11.2)$$

Proof Suppose $\alpha/\beta = r + ti$ and choose integers $r'$ and $t'$ as close as possible to the 
rational numbers $r$ and $t$, respectively. Then $\alpha/\beta = (r' + t'i) + [(r - r') + (t - t')i] = 
\gamma + \mu$ (say), where $|r - r'| \leq 1/2$, $|t - t'| \leq 1/2$. Now $\alpha = \beta\gamma + \beta\mu$. As $\alpha$, $\beta$ and 
$\gamma \in \mathbb{Z}[i]$, $\beta\mu = \alpha - \beta\gamma \in \mathbb{Z}[i]$. If $r = r'$ and $t = t'$, then $\mu = 0$. If not, then $\mu \neq 0$. 
Let $\beta = a + bi \in \mathbb{Z}[i]$. Then $N(\beta\mu) = (a^2 + b^2)[(r - r')^2 + (t - t')^2] \leq \frac{1}{2}(a^2 + b^2) 
< (a^2 + b^2) = N(\beta)$. Thus the theorem follows by taking $\lambda = \beta\mu$. □ 
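
The proof above is constructive, and it can be sketched in Python as follows. Gaussian integers are represented here as pairs `(a, b)` standing for $a + bi$; this representation and the names `nearest_int` and `divmod_gauss` are assumptions of the sketch, not notation from the text.

```python
from fractions import Fraction


def nearest_int(q):
    """Nearest integer to the rational number q (ties are rounded upwards)."""
    return (2 * q.numerator + q.denominator) // (2 * q.denominator)


def divmod_gauss(alpha, beta):
    """Return (gamma, lam) with alpha = beta*gamma + lam and N(lam) < N(beta)."""
    a, b = alpha
    c, d = beta
    n = c * c + d * d  # N(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be non-zero")
    # alpha / beta = alpha * conj(beta) / N(beta) = r + t i
    r = Fraction(a * c + b * d, n)
    t = Fraction(b * c - a * d, n)
    g0, g1 = nearest_int(r), nearest_int(t)  # gamma = r' + t' i
    # lam = alpha - beta * gamma
    lam = (a - (c * g0 - d * g1), b - (c * g1 + d * g0))
    return (g0, g1), lam


gamma, lam = divmod_gauss((27, 23), (8, 1))
print(gamma, lam)
```

With $\alpha = 27 + 23i$ and $\beta = 8 + i$ one finds $\gamma = 4 + 2i$ and $\lambda = -3 + 3i$, and $N(\lambda) = 18 < 65 = N(\beta)$.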

Proposition 11.6.2 Two Gaussian integers $\alpha$ and $\beta$, not both zero, have a greatest 
common divisor $\delta$ in $\mathbb{Z}[i]$, which is a Gaussian integer given by $\delta = \lambda\alpha + \mu\beta$, where 
$\lambda$ and $\mu$ are Gaussian integers. 

Proof Let $J = (\alpha, \beta)$ be the ideal generated by $\alpha$ and $\beta$ in the ring $\mathbb{Z}[i]$. Suppose $\delta$ 
is one of the non-zero elements of $J$ such that $N(\delta)$ is of minimum non-zero norm 
and $\alpha = \delta\gamma + \psi$ and $\beta = \delta\gamma_1 + \psi_1$ as in Theorem 11.6.2. We claim that $\psi$ and $\psi_1$ 
are both zero. If not, then a non-zero remainder among $\psi$ and $\psi_1$ lies in $J$ and has norm 
less than $N(\delta)$, which contradicts the minimality of $N(\delta)$. Hence $\psi = 0 = \psi_1$ shows that 
$\alpha = \delta\gamma$ and $\beta = \delta\gamma_1$. Consequently, $\delta$ is a common divisor of $\alpha$ and $\beta$. Again, since 
$\delta \in J$, it has the form $\delta = \lambda\alpha + \mu\beta$, $\lambda, \mu \in \mathbb{Z}[i]$. Thus $\delta$ is a multiple of every 
common divisor of $\alpha$ and $\beta$. This proves that $\delta$ is the greatest common divisor of $\alpha$ 
and $\beta$. □ 
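
In practice a greatest common divisor $\delta$ together with the coefficients $\lambda$, $\mu$ can be computed by repeated division with remainder, exactly as for rational integers. The following Python sketch uses the extended Euclidean algorithm rather than the ideal-theoretic argument of the proof; pairs `(a, b)` stand for $a + bi$, and all function names are hypothetical.

```python
def g_mul(z, w):
    """Product of two Gaussian integers given as pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def g_sub(z, w):
    """Difference of two Gaussian integers given as pairs."""
    return (z[0] - w[0], z[1] - w[1])


def g_divmod(alpha, beta):
    """Quotient and remainder with N(remainder) < N(beta) (nearest rounding)."""
    n = beta[0] ** 2 + beta[1] ** 2
    re = alpha[0] * beta[0] + alpha[1] * beta[1]  # real part of alpha*conj(beta)
    im = alpha[1] * beta[0] - alpha[0] * beta[1]  # imaginary part
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    return q, g_sub(alpha, g_mul(beta, q))


def ext_gcd(alpha, beta):
    """Return (delta, lam, mu) with delta = lam*alpha + mu*beta a gcd."""
    r0, r1 = alpha, beta
    s0, s1 = (1, 0), (0, 0)
    t0, t1 = (0, 0), (1, 0)
    while r1 != (0, 0):
        q, r = g_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, g_sub(s0, g_mul(q, s1))
        t0, t1 = t1, g_sub(t0, g_mul(q, t1))
    return r0, s0, t0


delta, lam, mu = ext_gcd((11, 3), (1, 8))
print(delta, lam, mu)
```

The value returned is one of the four associated greatest common divisors; for $\alpha = 11 + 3i$ and $\beta = 1 + 8i$ the sketch returns $\delta = -1 + 2i$, of norm 5.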

The remaining results on the decomposition of Gaussian integers can be proved in 
the same way as for rational integers. 

Proposition 11.6.3 (a) If $\lambda$ is a non-zero non-unit Gaussian integer, then there 
exists a Gaussian prime $p$ such that $p$ divides $\lambda$ in $\mathbb{Z}[i]$. 

(b) If $\lambda$ is a Gaussian prime and $\alpha$, $\beta$ are Gaussian integers such that $\lambda \mid \alpha\beta$, then 
$\lambda \mid \alpha$ or $\lambda \mid \beta$ in $\mathbb{Z}[i]$. 

Proof Left as an exercise. □ 

Theorem 11.6.3 Every non-zero, non-unit Gaussian integer $\alpha$ can be expressed 
uniquely as a product $\alpha = \lambda_1\lambda_2 \cdots \lambda_n$ of prime Gaussian integers, in the sense that any 
other decomposition of $\alpha$ in $\mathbb{Z}[i]$ into prime Gaussian integers has the same number 
of factors and can be arranged in such a way that the correspondingly placed factors 
are associates. 

Proof Left as an exercise. □ 

Theorem 11.6.4 A complex number $\alpha$ in the field $\mathbb{Q}(i)$ is a Gaussian integer iff 
the monic irreducible polynomial equation satisfied by $\alpha$ over $\mathbb{Q}$ has integral 
coefficients. 

Proof Let $\alpha = a + bi$ be a Gaussian integer which is not a rational integer. Then 
$b \neq 0$ and $\alpha$ satisfies a monic irreducible quadratic equation: $x^2 - 2ax + (a^2 + 
b^2) = 0$ over $\mathbb{Q}$ with rational integers as coefficients. Conversely, let a number $\alpha = 
p + qi \in \mathbb{Q}(i)$ satisfy a monic irreducible polynomial $f(x)$ in $\mathbb{Q}[x]$ with integral 
coefficients. Then it follows that $\alpha$ is a Gaussian integer. □ 


11.7 Algebraic Number Fields 

An algebraic number field is a field which is obtained from the field Q by adjoin- 
ing a finite number of algebraic numbers. In this section we shall give some basic 
properties of algebraic number fields. 

Definition 11.7.1 An algebraic number field $K$ is a subfield of $\mathbb{C}$ of the form $K = 
\mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n)$, where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are algebraic numbers. 

Example 11.7.1 (i) $\mathbb{Q}(\sqrt{2}, \sqrt{7}, \sqrt{11}, \sqrt{13})$ is an algebraic number field. 

(ii) Q( 75, is an algebraic number field. 

(iii) $\mathbb{Q}(\pi^2)$ is not an algebraic number field. 

Proposition 11.7.1 The roots of unity which lie in any given algebraic number field 
form a cyclic group. 

Proof Left as an exercise. □ 

Remark For more results see Exercises-IV. 


11.8 Quadratic Fields 

Explicit computation with algebraic integers is difficult. Instead of developing a gen- 
eral theory, we study algebraic integers in a quadratic field. More precisely, in this 
section we describe algebraic integers in a field $\mathbb{Q}(\alpha)$, where the complex number $\alpha$ 
is a root of an irreducible quadratic polynomial in $\mathbb{Q}[x]$. 

Definition 11.8.1 A subfield $F$ of $\mathbb{C}$ is called a quadratic field iff there exists a 
complex number $\alpha$ such that $F = \mathbb{Q}(\alpha)$, where $\alpha$ is a root of an irreducible quadratic 
polynomial in $\mathbb{Q}[x]$. 

Remark $\mathbb{Q}(\alpha) = \{a + b\alpha : a, b \in \mathbb{Q}\}$ and $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 2$. 

Example 11.8.1 $\mathbb{Q}(\sqrt{7})$ is a quadratic field. This is so because $x^2 - 7$ is irreducible 
in $\mathbb{Q}[x]$. 

Theorem 11.8.1 Let $F$ be a quadratic field. Then there exists a unique square free 
integer $n$ such that $F = \mathbb{Q}(\sqrt{n})$. 

Proof Let $F = \mathbb{Q}(\alpha)$, where $\alpha$ is a root of the irreducible polynomial $x^2 + bx + c$ 
over $\mathbb{Q}$. If $\alpha_1$ and $\alpha_2$ are the two values of $\alpha$, then $\alpha_1 = \frac{-b + \sqrt{b^2 - 4c}}{2}$ and 
$\alpha_2 = \frac{-b - \sqrt{b^2 - 4c}}{2}$. Hence $\alpha_1 + \alpha_2 = -b \in \mathbb{Q} \Rightarrow \mathbb{Q}(\alpha_1) = \mathbb{Q}(\alpha_2) \Rightarrow F = \mathbb{Q}(\alpha) = 
\mathbb{Q}(\alpha_1) = \mathbb{Q}\!\left(\frac{-b + \sqrt{b^2 - 4c}}{2}\right) = \mathbb{Q}(\sqrt{a})$, where $a = b^2 - 4c \in \mathbb{Q}$. But this $a$ is not the square of 
a rational number, since $x^2 + bx + c$ is irreducible in $\mathbb{Q}[x]$ by hypothesis. Suppose 
$a = p/q$, where $p, q$ are integers, $q > 0$ and $\gcd(p, q) = 1$. Let $l^2$ be the largest 
square dividing $pq$. Then $pq = l^2n$, where $n$ ($\neq 1$) is a square free integer and 
$F = \mathbb{Q}(\sqrt{a}) = \mathbb{Q}(\sqrt{p/q}) = \mathbb{Q}(\sqrt{pq}) = \mathbb{Q}(l\sqrt{n}) = \mathbb{Q}(\sqrt{n})$. To show the uniqueness 
of $n$, let $m$ be another square free integer such that $F = \mathbb{Q}(\sqrt{m})$. Then $\mathbb{Q}(\sqrt{n}) = 
\mathbb{Q}(\sqrt{m}) \Rightarrow \sqrt{n} = x + y\sqrt{m}$ for some $x, y \in \mathbb{Q} \Rightarrow n = x^2 + my^2 + 2xy\sqrt{m}$. If 
$xy \neq 0$, then $\sqrt{m} = (n - x^2 - my^2)/2xy$. This is a contradiction, since $\sqrt{m} \notin \mathbb{Q}$, 
as $m$ is square free. Hence $xy = 0$. Again if $y = 0$, then $x = \sqrt{n}$. This is again a 
contradiction, since $\sqrt{n} \notin \mathbb{Q}$ as $n$ is square free. Consequently, $x = 0$ and $n = my^2$. 
Again $n$ is square free and hence $y^2 = 1$. Consequently, $m = n$. □ 
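
The passage from the discriminant $a = p/q$ to the square free integer $n$ in this proof is easy to carry out by machine. The Python sketch below removes the largest square factor of $pq$ by trial division; the function name `squarefree_radicand` is hypothetical, and a return value of 1 signals that $a$ is a rational square, so that no quadratic field arises.

```python
from fractions import Fraction


def squarefree_radicand(a):
    """For a non-zero rational a = p/q in lowest terms, return the square free
    integer n with pq = l^2 * n, so that Q(sqrt(a)) = Q(sqrt(n))."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("a must be non-zero")
    pq = a.numerator * a.denominator
    sign = -1 if pq < 0 else 1
    m = abs(pq)
    n = 1
    d = 2
    while d * d <= m:
        exponent = 0
        while m % d == 0:
            m //= d
            exponent += 1
        if exponent % 2 == 1:  # keep one copy of each prime of odd exponent
            n *= d
        d += 1
    n *= m  # the remaining factor is 1 or a prime occurring to the first power
    return sign * n


# x^2 + x + 1 has b = c = 1, so a = b^2 - 4c = -3 and F = Q(sqrt(-3)).
print(squarefree_radicand(1 - 4))
# a = 18/5 gives pq = 90 = 3^2 * 10, so Q(sqrt(18/5)) = Q(sqrt(10)).
print(squarefree_radicand(Fraction(18, 5)))
```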

We now describe the form of the algebraic integers $\alpha$ in the quadratic field 
$F = \mathbb{Q}(\sqrt{n})$, where $n$ is a square free integer. 

Let $\alpha = \frac{l + m\sqrt{n}}{d}$, where $l, m, d$ are integers such that $d > 0$ and $\gcd(l, m, d) = 1$. 
If $m = 0$, then $\alpha = l/d \Rightarrow \alpha$ is a quadratic integer iff $d = 1$. Next suppose 
$m \neq 0$. Now $\alpha$ is a root of a polynomial $x^2 + bx + c \in \mathbb{Z}[x] \Rightarrow \left(\frac{l + m\sqrt{n}}{d}\right)^2 + 
b\left(\frac{l + m\sqrt{n}}{d}\right) + c = 0 \Rightarrow l^2 + m^2n + bdl + cd^2 = 0$ and $2l + bd = 0$, since $m \neq 0$. 
Let $\gcd(l, d) = t > 1$. Then there exists a prime factor $p$ of $t$ and hence 
$p \mid l$ and $p \mid d \Rightarrow p^2 \mid m^2n \Rightarrow p^2 \mid m^2$, since $n$ is square free $\Rightarrow p \mid m$. This is a 
contradiction, since $\gcd(l, m, d) = 1$ and hence $t = 1$. Thus $\gcd(l, d) = 1 \Rightarrow 
d \mid 2$, since $2l + bd = 0 \Rightarrow d = 1$ or $d = 2$. If $d = 1$, then $\alpha = l + m\sqrt{n}$ is 
a quadratic integer. If $d = 2$, then $\alpha = \frac{l + m\sqrt{n}}{2} \Rightarrow (\alpha - (l/2))^2 = \frac{m^2n}{4} \Rightarrow \alpha$ 
is a root of the quadratic equation $x^2 - lx + \frac{l^2 - m^2n}{4} = 0$ over $\mathbb{Z} \Rightarrow l^2 \equiv 
m^2n \pmod 4$, since the above coefficients are all integers. Again $\gcd(l, d) = 
1 \Rightarrow \gcd(l, 2) = 1 \Rightarrow l$ is odd $\Rightarrow 1 \equiv m^2n \pmod 4$, since $l^2 \equiv m^2n \pmod 4$. 
We now consider the possible cases: $n \equiv 1 \pmod 4$ or $n \equiv 2 \pmod 4$ or $n \equiv 
3 \pmod 4$. 

Case 1 If $n \equiv 1 \pmod 4$, then $1 \equiv m^2 \pmod 4$. This is true iff $m$ is odd. 

Case 2 If $n \equiv 2 \pmod 4$, then $1 \equiv 2m^2 \pmod 4$. But there is no integer $m$ such that 
$1 \equiv 2m^2 \pmod 4$. 

Case 3 If $n \equiv 3 \pmod 4$, then $1 \equiv 3m^2 \pmod 4$. But there is no integer $m$ such that 
$1 \equiv 3m^2 \pmod 4$. 

From the above discussion there follows Theorem 11.8.2. 


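Theorem 11.8.2 Let $A$ be the set of all algebraic integers in the quadratic field 
$F = \mathbb{Q}(\sqrt{n}) = \{a + b\sqrt{n} \mid a, b \in \mathbb{Q}\}$, where $n$ is a square free integer. Then the set 
$A$ consists of the numbers 

$$l + m\sqrt{n}, \quad \text{where } l, m \text{ are integers (for every } n\text{)},$$

together with, when $n \equiv 1 \pmod 4$, the numbers 

$$\frac{l + m\sqrt{n}}{2}, \quad \text{where } l, m \text{ are odd integers.}$$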
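
Whether a particular element $(l + m\sqrt{n})/d$ of a quadratic field is an algebraic integer can be decided without the case analysis, by checking that its trace $2l/d$ and its norm $(l^2 - m^2n)/d^2$ are rational integers; these are, up to sign, the coefficients of the monic quadratic $x^2 - (2l/d)x + (l^2 - m^2n)/d^2$ that it satisfies. The Python sketch below applies this standard criterion (it is not the argument used in the text), and the function name is hypothetical.

```python
from fractions import Fraction


def is_quadratic_integer(l, m, d, n):
    """Decide whether (l + m*sqrt(n))/d is an algebraic integer, for integers
    l, m, d with d > 0 and a square free integer n, by testing whether the
    trace and the norm are rational integers."""
    trace = Fraction(2 * l, d)
    norm = Fraction(l * l - m * m * n, d * d)
    return trace.denominator == 1 and norm.denominator == 1


print(is_quadratic_integer(1, 1, 2, 13))  # (1 + sqrt(13))/2
print(is_quadratic_integer(1, 1, 2, 3))   # (1 + sqrt(3))/2
print(is_quadratic_integer(0, 1, 2, 2))   # sqrt(2)/2 = 1/sqrt(2)
```

The first element is an algebraic integer since $13 \equiv 1 \pmod 4$, while the other two are not, in agreement with Theorem 11.8.2 and Example 11.5.1(v).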


11.9 Exercises 

Exercises-IV 

1. Show that a simple transcendental extension of a field F is isomorphic to the 
field of rational functions over F and is of infinite degree over F; moreover, 
any two simple transcendental extensions of F are isomorphic. 
2. Let $F(\alpha)$ be a simple algebraic extension of a field $F$. Prove that 

(a) $F(\alpha) \cong F[x]/(f(x))$, where $f(x)$ is the minimal polynomial of $\alpha$ over $F$; 

(b) If $n$ is the degree of $f(x)$, then $S = \{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$ forms a basis of 
$F(\alpha)$ over $F$; 

(c) If $\beta$ is an algebraic element of $F(\alpha)$, then the degree of $\beta$ over $F$ is a divisor 
of the degree of $\alpha$. 

3. (a) Let $F$ be a subfield of $\mathbb{C}$. If $\alpha \in \mathbb{C}$ and $\beta \in \mathbb{C}$ are algebraic over $F$, show 
that there exists an element $\gamma \in \mathbb{C}$ such that $F(\alpha, \beta) = F(\gamma)$. 

(b) Let $F$ be a subfield of $\mathbb{C}$. If $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{C}$ are algebraic over $F$, show 
that there exists an element $\gamma \in \mathbb{C}$ such that $F(\alpha_1, \alpha_2, \ldots, \alpha_n) = F(\gamma)$. 

(c) If $F$ is an algebraic number field, prove that there exists an algebraic 
integer $\gamma$ such that $F = \mathbb{Q}(\gamma)$. 

[Hint. Use 3(b).] 

(d) Let $F = \mathbb{Q}(\gamma)$ be an algebraic number field, where $\gamma$ is an algebraic integer. 
If the degree of the minimal polynomial of $\gamma$ over $\mathbb{Q}$ is $n$, then prove that 
every element of $F$ can be expressed uniquely in the form $a_0 + a_1\gamma + \cdots + 
a_{n-1}\gamma^{n-1}$, where $a_i$ are rational numbers. 

[Hint. $F$ is a vector space over $\mathbb{Q}$ of dimension $n$.] 

(e) Let $F$ be an algebraic number field and $K$ be the set of all algebraic integers. 
Then show that the set $H = K \cap F$ is an integral domain (called the ring 
of integers of the algebraic number field $F$) and the quotient field of $H$ 
is $F$. 

(f) Let $F$ be a field of characteristic 0. 

(i) If $\alpha, \beta$ are algebraic over $F$, show that there exists an element $\gamma \in 
F(\alpha, \beta)$ such that $F(\alpha, \beta) = F(\gamma)$. 

(ii) Prove that any finite field extension of the field $F$ is a simple 
extension. 

[Hint. Use induction argument to case (a).] 

4. Let $F$ be a field and $K$ be a field extension of $F$ of prime dimension $p$. If $\alpha$ is an 
element of $K$ but not in $F$, show that $\alpha$ has degree $p$ and $K = F(\alpha)$. 

[Hint. $[K : F] = p \Rightarrow [K : F(\alpha)][F(\alpha) : F] = p \Rightarrow [K : F(\alpha)] = 1$, since 
$[F(\alpha) : F] \neq 1$, as $\alpha \notin F$. Hence $[F(\alpha) : F] = p$ and $K = F(\alpha)$.] 

5. Let $F$ be a subfield of the field $\mathbb{C}$. If an element $\alpha$ of $\mathbb{C}$ is algebraic over $F$, then 
prove that 

(a) every element $\beta$ of $F(\alpha)$ is algebraic over $F$; 

(b) degree of $\beta$ over $F$ $\leq$ degree of $\alpha$ over $F$. 

6. Let $F \subset L \subset K$ be a tower of fields such that $K$ is algebraic over $L$ and $L$ is 
algebraic over $F$. Prove that $K$ is algebraic over $F$. 

7. Let $F$ be a field. Prove that $F$ is algebraically closed iff every irreducible 
polynomial in $F[x]$ is of degree 1. 

8. (a) Let $F$ be a field. Show that there exists an algebraic field extension $K$ of $F$ 
such that $K$ is algebraically closed (called an algebraic closure of $F$). 

(b) If $K$ and $L$ are two algebraic closures of $F$, then show that there is an 
isomorphism $\psi : K \to L$ such that $\psi(a) = a$, $\forall a \in F$. 

Remark Given a field $F$, its algebraic closure is unique up to isomorphisms. 

9. (a) Prove that a number $\alpha$ is an algebraic integer iff the additive group 
generated by all the powers $1, \alpha, \alpha^2, \alpha^3, \ldots$ of $\alpha$ can be generated by a finite 
number of elements. 

(b) If all the positive powers of an algebraic number $\alpha$ lie in an additive group 
generated by a finite set of numbers $a_1, a_2, \ldots, a_n$, then show that $\alpha$ is an 
algebraic integer. 

(c) Prove that the set of all algebraic integers is an integral domain. 

(d) In any field of algebraic numbers, prove that the set of algebraic integers is 
also an integral domain. 

10. Show that the minimal polynomial of an algebraic integer is monic with integral 
coefficients. 

11. (a) Let $K$ be a finite extension over $F$ such that $[K : F] = n$. Prove that every 
element $\alpha$ of $K$ has a degree $m$ over $F$ such that $m$ is a divisor of $n$. 

(b) Let $f(x)$ be an irreducible cubic polynomial over a field $F$. If $K$ is an 
extension of $F$ of degree $2^n$, then show that $f(x)$ is irreducible in $K[x]$. 

(c) Using (b) as the algebraic basis show that by straight edge and compass 
alone, it is impossible to 

(i) duplicate a general cube (duplication of a cube); 

(ii) trisect a general angle (trisection of an angle). 

If a real number can be constructed by straight edge and compass 
alone, then the number is said to be constructible. 

[Hint. (a) The element $\alpha$ generates a simple extension $F(\alpha)$ of $F$. Hence 
$[K : F(\alpha)][F(\alpha) : F] = n$. 

(b) If $f(x)$ is reducible over an extension $K$ of degree $2^n$ over $F$, then the cubic 
polynomial $f(x)$ must have at least one linear factor in $K[x]$. This shows that 
$K$ contains a root $\alpha$ of $f(x)$. But such an element $\alpha$ of degree 3 over $F$ cannot 
lie in a field $K$ of degree $2^n$ over $F$ by (a). 

(c) (i) If it is possible to construct another cube of double the volume of a unit 
cube, then the side $x$ of the new cube must satisfy the equation $x^3 - 2 = 0$. But 
by the Eisenstein criterion, the polynomial $x^3 - 2$ is irreducible in $\mathbb{Q}[x]$. Over 
any field $F$ corresponding to the straight edge and compass construction, the 
polynomial $x^3 - 2$ is also irreducible by (b). 

(ii) A $60°$ angle is constructible, because $\cos 60° = 1/2$. But if an angle 
of $60°$ is trisected by straight edge and compass, then $\cos 60° = 4\cos^3 20° - 
3\cos 20°$ shows that $x = \cos 20°$ must satisfy the cubic equation $8x^3 - 6x - 1 = 
0$ in $\mathbb{Q}[x]$. But $8x^3 - 6x - 1$ is irreducible in $\mathbb{Q}[x]$.] 

12. An extension $K$ of a field $F$ is a root field of a polynomial $f(x)$ of degree $\geq 1$, 
with coefficients in $F$ iff 

(i) $f(x)$ can be factored into linear factors $f(x) = a(x - \alpha_1) \cdots (x - \alpha_n)$ in 
$K[x]$; and 

(ii) $K = F(\alpha_1, \alpha_2, \ldots, \alpha_n)$. 

Prove that 

(a) any polynomial over any field has a root field (existence). 

(b) all root fields of a given polynomial over a given field $F$ are isomorphic 
(uniqueness). 

(c) for any prime integer $p$ and any positive integer $n$, there exists a finite 
field with $q = p^n$ elements, which is the root field of $x^q - x$ over $\mathbb{Z}_p$. 

Exercises-V Identify the correct alternative(s) (there may be more than one) from 
the following list: 

1. Let $R$ be the quotient ring $\mathbb{Z}[i]/n\mathbb{Z}[i]$. Then $R$ is an integral domain if $n$ is 

(a) 19 (b) 7 (c) 13 (d) 1. 

2. The field extension $\mathbb{Q}(\sqrt{2} + \sqrt[3]{2})$ over the field $\mathbb{Q}(\sqrt{2})$ has degree 

(a) 1 (b) 3 (c) 2 (d) 6. 

3. If the polynomial $x^4 + x + 6$ has a root of multiplicity greater than 1 over a field 
of characteristic $p$ (prime), then $p$ is 

(a) $p = 5$ (b) $p = 3$ (c) $p = 2$ (d) $p = 7$. 

4. Let $F$ be a field of eight elements and $B$ be a subset of $F$ such that $B = \{x \in 
F : x^7 = 1$ and $x^n \neq 1$ for all positive integers $n$ less than $7\}$. Then the number 
of elements in $B$ is 

(a) 3 (b) 2 (c) 1 (d) 6. 

5. Let $f(x) = 2x^2 + x + 2$ and $g(x) = x^3 + 2x^2 + 1$ be two polynomials over the 
field $\mathbb{Z}_3$. Then 

(a) $f(x)$ and $g(x)$ are both irreducible; 

(b) neither $f(x)$ nor $g(x)$ is irreducible; 

(c) $f(x)$ is irreducible, but $g(x)$ is not; 

(d) $g(x)$ is irreducible, but $f(x)$ is not. 

6. Let $F$ be a field and the polynomial $f(x) = x^3 - 312312x + 123123$ be 
irreducible in $F[x]$. Then 

(a) F is a finite field with 7 elements; 

(b) F is a finite field with 13 elements; 

(c) F is a finite field with 3 elements; 

(d) F = Q of rational numbers. 

7. Let $\omega$ be an imaginary cube root of unity, i.e., $\omega$ is a complex number such that 
$\omega^3 = 1$ and $\omega \neq 1$. If $K$ is the field $\mathbb{Q}(\sqrt[3]{2}, \omega)$ generated by $\sqrt[3]{2}$ and $\omega$ over the 
field $\mathbb{Q}$ of rational numbers and $n$ is the number of subfields $L$ of $K$ such that 
$\mathbb{Q} \subsetneq L \subsetneq K$, then $n$ is 

(a) 4 (b) 3 (c) 5 (d) 2. 

8. Let $F_{5^n}$ be the finite field with $5^n$ elements. If $F_{5^n}$ contains a non-trivial root of 
unity, then $n$ is 

(a) 15 (b) 6 (c) 92 (d) 30. 

9. Which of the following statements is (are) correct? 

(a) sin 7° is algebraic over Q; 

(b) $\sin^{-1} 1$ is algebraic over $\mathbb{Q}$; 

(c) $\cos(\pi/17)$ is algebraic over $\mathbb{Q}$; 

(d) $\sqrt{2} + \sqrt{\pi}$ is algebraic over $\mathbb{Q}(\pi)$. 

10. Which of the following statements is (are) correct? 

(a) There exists a finite field in which the additive group is not cyclic; 

(b) Every infinite cyclic group is isomorphic to the additive group of integers; 

(c) If $F$ is a finite field, there exists a polynomial $p$ over $F$ such that $p(x) = 0$ 
for all $x \in F$, where 0 denotes the zero in $F$; 

(d) Every finite field is isomorphic to a subfield of the field of complex 
numbers. 

11. Consider the ring $\mathbb{Z}_n$ for $n \geq 2$. Which of the following statements is (are) 
correct? 

(a) If $\mathbb{Z}_n$ is a field, then $n$ is a composite integer; 

(b) $\mathbb{Z}_n$ is a field iff $n$ is a prime integer; 

(c) If $\mathbb{Z}_n$ is an integral domain, then $n$ is a prime integer; 

(d) If there is an injective ring homomorphism of $\mathbb{Z}_5$ to $\mathbb{Z}_n$, then $n$ is prime. 

12. Let $F_p$ be the field $\mathbb{Z}_p$, where $p$ is a prime. Let $F_p[x]$ be the associated 
polynomial ring. Which of the following quotient rings is (are) fields? 

(a) $F_5[x]/(x^2 + x + 1)$; (b) $F_2[x]/(x^3 + x + 1)$; 

(c) $F_3[x]/(x^3 + x + 1)$; (d) $F_7[x]/(x^2 + 1)$. 


11.10 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Alaca and 
Williams 2004; Artin 1991; Birkhoff and Mac Lane 2003; Hardy and Wright 2008; 
Hungerford 1974; Rotman 1988) for further details. 
Chapter 12 

Introduction to Mathematical Cryptography 


The aim of this chapter is to introduce various cryptographic notions, from 
historical ciphers to modern schemes, by using mathematical tools based mainly 
on number theory, modern algebra, probability theory and information theory. 
In the modern digital world, the word “cryptography” is well known to many of us. 
Every day, knowingly or unknowingly, we use different techniques of cryptography 
in many places: logging on to a PC, sending e-mails, withdrawing money from an 
ATM by using a PIN code, operating a locker at a bank with the help of a designated 
person from the bank, sending messages by using a mobile phone, buying things 
through the Internet by using a credit card, or transferring money digitally from 
one account to another over the Internet. If we observe carefully, we see that in 
every case we need to hide some information or to transfer information secretly. 
So intuitively we can guess that cryptography has something to do with security. 
In this chapter, we introduce cryptography, provide a brief overview of the subject, 
discuss its basic goals and present the subject both intuitively and mathematically. 
More precisely, various cryptographic notions, starting from the historical ciphers 
and leading to modern cryptographic notions like public-key encryption schemes, 
signature schemes, oblivious transfer, secret sharing schemes and visual cryptography, 
are explained by using mathematical tools mainly based on modern algebra. Finally, 
the implementation issues of three public-key cryptographic schemes, namely RSA, 
ElGamal, and Rabin, using the open-source software SAGE are discussed. 


12.1 Introduction to Cryptography 

Cryptography has been used almost since writing was invented. For the larger part 
of its history, cryptography remained an art, a game of ad hoc designs and attacks. 
Historically, the notion of cryptography arose as a means to enable parties to 
maintain the privacy of the information they sent to each other, even in the 
presence of an enemy who had access to the communicated message. 
The name cryptography comes from the Greek words “kruptos” (meaning hidden) 
and “graphia” (meaning writing). In short, cryptography means the art of writing 
secret messages. In general, the aim of cryptography is to construct efficient schemes 
achieving some desired functionality, even in an adversarial environment. For 
example, the most basic question in cryptography is that of secure communication 
across an insecure channel. Before we start describing the major objectives of 
cryptography, let us first introduce the basic model of cryptography. Under this model, 
two parties, conventionally named Alice (sender) and Bob (receiver), wish to 
communicate with each other over an insecure channel (think of the Internet or a mobile 
network) in which Alice wishes to send a secret message to Bob. Now the first 
question is: what do we mean by an insecure channel? An insecure channel is a 
channel controlled by adversaries (enemies) who are trying to get hold of the 
message of Alice, or to modify it and trick Bob into believing that some other 
message was sent by Alice. In cryptography, we assume that the adversary has 
practically unlimited computation power. He can not only eavesdrop on any message, 
but also take the message and change some of its bits. Traditionally, the two basic 
goals of cryptography have been secrecy and authenticity (or data-integrity). Secrecy 
requires that the message be guarded from the adversary, that is, the message sent 
by Alice to Bob should be hidden from the adversary so that only the intended 
recipient, i.e., Bob, can see it. On the other hand, authenticity ensures that 
Bob is not tricked into accepting a message that did not originate from Alice, 
that is, the message received by Bob must be the same as the message that was 
sent by Alice. To achieve such goals, there must be something that 
Bob knows but the adversary does not know, otherwise the adversary could simply 
apply the same method that Bob does and thus be able to read the messages sent by 
Alice. 

In the presence of such adversaries, cryptography helps Alice and Bob to achieve
security goals such as privacy or authenticity by providing them with a weapon
known as a protocol. A protocol is just a collection of programs or algorithms.
The protocol for Alice guides her to package or encapsulate her data for
transmission, while the protocol for Bob guides him to decapsulate the package
received from Alice and recover the data, possibly together with associated
information telling him whether or not to regard it as authentic.

To understand the basic cryptography, we first need to understand the basic ter- 
minologies associated with cryptography. The original secret message that Alice 
wants to send to Bob is called plain text, while the disguised message that is sent 
by Alice to Bob is called cipher text. The procedure of converting a plain text into a 
cipher text is known as encryption, while the procedure to convert a cipher text into 
the plain text is known as decryption. 

Let us now define formally what we mean by a cryptosystem. 

Definition 12.1.1 A cryptosystem is a five-tuple $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where the
following conditions are satisfied:

• $\mathcal{P}$ is a finite set of possible plain texts;

• $\mathcal{C}$ is a finite set of possible cipher texts;

• $\mathcal{K}$, the key space, is a finite set of possible keys;

• For each $K \in \mathcal{K}$, there is an encryption rule $e_K \in \mathcal{E}$ and a corresponding
decryption rule $d_K \in \mathcal{D}$. Each $e_K : \mathcal{P} \to \mathcal{C}$ and $d_K : \mathcal{C} \to \mathcal{P}$ are functions such that
$d_K(e_K(x)) = x$ for every plain text element $x \in \mathcal{P}$.


12.2 Kerckhoffs’ Law 

In cryptography, Kerckhoffs’ law (also called Kerckhoffs’ assumption or Kerck- 
hoffs’ principle) was stated by Auguste Kerckhoffs (1835-1903), a professor of 
languages at the School of Higher Commercial Studies, Paris, in the 19th century. 
It states that a cryptosystem should be secure even if everything about the system,
except the key, is public knowledge. It was reformulated (perhaps independently)
by Claude Shannon as “the enemy knows the system”. The idea is that if any part
of a cryptosystem (except the key) has to be kept secret, then the cryptosystem is
not secure. So in cryptography it is usually assumed that the enemy, i.e., the
adversary, knows the cryptosystem that is being used. Of course, if the adversary
does not know the cryptosystem that is being used, his task becomes more
difficult. Thus the goal of a cryptographer is to design a secure cryptosystem
which satisfies Kerckhoffs’ principle. 

Broadly speaking, a cryptographer’s work may be divided into two parts: namely,
the constructive part (cryptography) and the destructive part (cryptanalysis). In
the constructive part, the aim of a cryptographer is to construct a secure
cryptosystem; in the destructive part, the aim is to find weaknesses in some
existing cryptosystem in order to break it. For example, the cryptographers
associated with the banking sector try to make their systems secure so that all
transactions over the Internet can be made safely, whereas the cryptographers of
defense or intelligence organizations try to break the code that has been
transmitted between two suspects. Together, cryptography and cryptanalysis make up
the subject called “CRYPTOLOGY”. There is no doubt that cryptography and
cryptanalysis complement each other and are being developed side by side. To
construct a good and secure cryptographic scheme, one has to have a good knowledge
of cryptanalysis techniques. On the other hand, to attack a scheme, one has to
have a good knowledge of the construction of the scheme. Thus, without a good
knowledge of cryptanalysis, it is not possible to construct a good cryptographic
scheme, and vice versa.


12.3 Cryptanalysis 

Cryptanalysis is the study of a cryptographic system for the purpose of finding 
weaknesses in the system and breaking the code used to encrypt the data without 
accessing the secret information which is normally required to do so. Typically,
this involves finding the secret key. Although the actual word “cryptanalysis” is
relatively recent (it was coined by William Friedman in 1920), methods for breaking
codes and ciphers are much older. The first known recorded explanation of
cryptanalysis was given in the 9th century by the Arab polymath Abu Yusuf Yaqub ibn
Ishaq al-Sabbah Al-Kindi in A Manuscript on Deciphering Cryptographic Messages.
This manuscript includes a description of the method of frequency analysis. 

Even though the goal has been the same, the methods and techniques of
cryptanalysis have changed drastically through the evolution of cryptography,
ranging from the pen-and-paper methods of the past, through machines like Enigma in
World War II, to the computer-based schemes of the present. Even the results of
cryptanalysis have changed: it is no longer possible to have unlimited success in
code breaking. In the mid-1970s, a new class of cryptography, known as asymmetric
cryptography, was introduced in the literature of cryptology. Methods for breaking
these cryptosystems are different from before and usually involve solving a
carefully constructed problem in pure mathematics, the best known being integer
factorization.

Now we will discuss different attack models on cryptography. The most common 
types of attacks are given below. 

• Cipher text only attack : In cryptography, a cipher text only attack is a form of
cryptanalysis, where the attacker is assumed to have access only to a set of cipher
texts. In the history of cryptography, early ciphers, implemented using
pen-and-paper, were routinely broken using cipher texts alone. Cryptographers
developed a variety of statistical techniques for attacking cipher text, such as
frequency analysis. Mechanical encryption devices such as Enigma made these attacks
much more difficult. This cipher machine was invented by a German engineer,
Arthur Scherbius. The Enigma machine was an ingenious advance in technology. 
It is an electromechanical machine, very similar to a typewriter, with a plugboard 
to swap letters, rotors to further scramble the alphabet and a lamp panel to dis- 
play the result. Most models of Enigma used 3 or 4 rotors with a reflector to allow 
the same settings to be used for enciphering and deciphering. The German Navy 
and Army adopted the Enigma in 1926 and 1928, respectively, but only added 
the plugboard in 1930. The history of breaking the code for Enigma by Polish 
mathematicians is very exciting. The Polish were understandably nervous about 
German aggression and on September 1, 1932 the Polish Cipher Bureau hired a 
27-year-old Polish mathematician, Marian Rejewski, along with two fellow Poz- 
nan University mathematics graduates, Henryk Zygalski and Jerzy Rozycki, to try 
to break the code of the Enigma machine. This was an early insight into the role
of mathematics in code breaking. The three Polish code breakers had access to an
Enigma machine, but did not know the rotor wiring. Through a German spy, the
French gained access to two months of Enigma key settings. But without the rotor
wiring, they were unable to make use of this information. They passed this
information to their British and Polish colleagues, and the Polish were able to
quickly solve the Enigma puzzle, recreating the three rotors then in use to mount a
successful cipher text-only cryptanalysis of the Enigma. This was in March 1933, and
they continued to break the code until the Nazis invaded Poland on September 1,
1939, marking the start of WW2. 

• Known plain text attack : In cryptography, a known plain text attack is a form of 
cryptanalysis, where the attacker has a cipher text corresponding to a plain text of 
an arbitrary message not of his choice. 

• Chosen plain text attack : In cryptography, a chosen plain text attack is a form 
of cryptanalysis, where the attacker has the capability to compute the cipher text 
corresponding to an arbitrary plain text message of his choice. 

• Chosen cipher text attack : In cryptography, a chosen cipher text attack is a form
of cryptanalysis, where the attacker has temporary access to the decryption
machinery to find the plain text corresponding to a cipher text message of his
choice. 

In each case the objective of the attacker is to determine the key that was used. We
observe that the chosen cipher text attack is most relevant to a special type of
cryptosystem known as an asymmetric or public-key cryptosystem.


12.3.1 Brute-Force Search 

Now we will discuss informally an important term in cryptanalysis known as brute- 
force search. Suppose a cryptanalyst has found a plain text and a corresponding 
cipher text but he does not know the key. He can simply try encrypting the plain text 
using each possible key, until the cipher text matches or try decrypting the cipher 
text to match the plain text. This is called brute-force search. So every cryptosys- 
tem can be broken using brute-force search attack. However, every well-designed 
cryptosystem has such a large key space that this brute-force search is not practical. 
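
The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name brute_force_key and the toy cipher toy_encrypt (a cyclic rotation of the 26 uppercase letters) are hypothetical and are not part of the text; any encryption function with a small, enumerable key space could be substituted.

```python
from typing import Callable, Iterable, Optional


def brute_force_key(
    plain: str,
    cipher: str,
    encrypt: Callable[[str, int], str],
    key_space: Iterable[int],
) -> Optional[int]:
    """Return the first key that maps the known plain text to the cipher text."""
    for key in key_space:
        if encrypt(plain, key) == cipher:
            return key
    return None  # no key in the searched space matches


def toy_encrypt(text: str, key: int) -> str:
    """Hypothetical toy cipher: rotate each uppercase letter by `key` places."""
    return "".join(chr((ord(ch) - ord("A") + key) % 26 + ord("A")) for ch in text)


# Known plain text "ROME" and its cipher text "URPH"; the key is unknown.
print(brute_force_key("ROME", "URPH", toy_encrypt, range(26)))  # prints 3
```

The loop performs at most one trial encryption per key, so its cost grows linearly with the size of the key space; this is exactly why a large key space makes the search impractical.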

In academic cryptography, a weakness or a break in a scheme is usually defined
quite conservatively. Bruce Schneier sums up this approach: “Breaking a cipher
simply means finding a weakness in the cipher that can be exploited with a
complexity less than brute-force. Never mind that brute-force might require $2^{128}$
encryptions; an attack requiring $2^{110}$ encryptions would be considered a break
... simply put, a break can just be a certificational weakness: evidence that the
cipher does not perform as advertised.”


12.4 Some Classical Cryptographic Schemes 

Cryptography is as old as writing itself and has been used for thousands of years to 
safeguard military and diplomatic communications. It has a long fascinating history. 
Kahn’s (1967) The Codebreakers is the most complete non-technical account of the 
subject. This book traces cryptography from its initial and limited use by Egyptians 
some 4000 years ago, to the twentieth century where it played a critical role in the 
outcome of both the world wars. Before the 1980s, cryptography was used primarily
for military and diplomatic communications and in fairly limited contexts. But 
now cryptography is the only known practical method for protecting information 
transmitted through communication networks that use land lines, communication 
satellites and microwave facilities. In some instances it can be the most economical 
way to protect stored data. 

Let us start with some classical examples of cryptographic schemes in which 
both the sender and the receiver agree upon a common key secretly before the ac- 
tual communication starts. We shall explain later that this concept leads to a very 
important notion of cryptography known as symmetric key cryptosystem. 


12.4.1 Caesarian Cipher or Shift Cipher 

This cryptographic scheme was devised by Julius Caesar around 2000 years ago,
much earlier than the invention of group theory. Caesar used his scheme to send 
instructions or commands through messengers to his generals and allies who stayed 
at the war front. Now if the message was sent in a simple text form (plain text), it 
could be caught by the enemies or the messenger might read the secret messages. To 
avoid such a situation, Julius Caesar introduced a new method. Before going to the 
war front Julius Caesar and his generals agreed upon a secret number, say 3, which 
is the key of the cryptosystem. When Julius Caesar needed to send messages to 
the generals at the war front, he just cyclically shifted every letter of the command 
or instruction by 3 positions to the right. If we take the example of the English
alphabet, without differentiating the lower and the upper case, we have altogether
26 letters. So if we shift cyclically each letter of the English alphabet 3 times
to the right, A will be shifted to D, B will be shifted to E, ..., X will be
shifted to A, Y will be shifted to B and finally, Z will be shifted to C. If a
message said “ROME”, it would look like “URPH” (as R -> U, O -> R, M -> P and
E -> H). Now those who know the value of the shift to be 3 (in cryptography we call
this value the “key”) will shift cyclically each letter of the cipher text 3 times
to the left and can easily recover the plain text.

Now think about the Caesarian shift cipher from today’s viewpoint. The question
is: can we associate the Caesarian shift cipher with mathematics? Today, we have
an important tool of mathematics, known as the cyclic group. We can number the
letters A, B, ..., Z of the English alphabet (not distinguishing the lower and
upper cases) as 0, 1, ..., 25, respectively. Since each letter is shifted
cyclically to the right (in the above example, it is shifted 3 times to the right),
we can represent the set of English letters as the set $\mathbb{Z}_{26}$, the set of all integers
modulo 26. We know that $(\mathbb{Z}_{26}, +)$ forms a cyclic group with respect to addition
modulo 26. Now as each letter is shifted cyclically $k$ times to the right, we can
define the encryption function $e_k$ from $\mathbb{Z}_{26}$ onto $\mathbb{Z}_{26}$, for the fixed $k \in \mathbb{Z}_{26}$, by
$e_k(x) = x + k$, $\forall x \in \mathbb{Z}_{26}$. So the decryption function can be written as
$d_k(x) = x - k$, $\forall x \in \mathbb{Z}_{26}$.
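
The two maps on $\mathbb{Z}_{26}$ translate directly into code. The following is a minimal sketch, assuming the message consists of the letters A to Z only; the function names shift_encrypt and shift_decrypt are illustrative and do not come from the text.

```python
def shift_encrypt(plain: str, k: int) -> str:
    """Apply e_k(x) = x + k in Z_26 to every letter (A = 0, ..., Z = 25)."""
    return "".join(
        chr((ord(ch) - ord("A") + k) % 26 + ord("A")) for ch in plain.upper()
    )


def shift_decrypt(cipher: str, k: int) -> str:
    """Apply d_k(x) = x - k in Z_26, i.e. encrypt with the inverse key -k."""
    return shift_encrypt(cipher, -k)


cipher = shift_encrypt("ROME", 3)
print(cipher)                    # URPH
print(shift_decrypt(cipher, 3))  # ROME
```

Decryption reuses the encryption routine with the key -k, which reflects the fact that -k is the inverse of k in the group $(\mathbb{Z}_{26}, +)$.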

Now the question is: In today’s computer age, is the shift cipher secure? Assume 
that an adversary only has a piece of cipher text along with the knowledge that 
this cipher text is obtained by using shift cipher. The adversary does not have the 
knowledge of the secret key k. However, it is very easy for the adversary to have a 
“cipher text only attack” as there are only 26 possible keys. So the adversary will 
try every key and will see which key decrypts the cipher text into a plain text that 
“makes sense”. As mentioned earlier, such an attack on an encryption scheme is 
called a brute-force attack or exhaustive search attack. Clearly, any secure encryp- 
tion scheme must not be vulnerable to such a brute-force attack; otherwise, it can 
be completely broken, irrespective of how sophisticated the encryption algorithm 
is. This leads to a trivial but important principle, called, the “sufficient key space 
principle”. It states that 

Any secure encryption scheme must have a key space that is not vulnerable to exhaustive 
search. 

So we try to increase the size of the key space in the next cryptographic scheme. 
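
The cipher text only attack on the shift cipher described above may be sketched as follows. The helper names are hypothetical; the code simply lists all 26 candidate decryptions and leaves it to the reader (or to a dictionary check, which is not implemented here) to pick the one that “makes sense”.

```python
def shift_decrypt(cipher: str, k: int) -> str:
    """Shift every letter of an A-Z cipher text k places to the left."""
    return "".join(
        chr((ord(ch) - ord("A") - k) % 26 + ord("A")) for ch in cipher.upper()
    )


def all_candidates(cipher: str) -> dict:
    """Exhaustive search: map each of the 26 keys to its candidate plain text."""
    return {k: shift_decrypt(cipher, k) for k in range(26)}


# The attacker sees only "URPH" and tries every key.
for key, candidate in all_candidates("URPH").items():
    print(key, candidate)
```

Among the 26 printed lines, only the line for key 3 reads ROME, so the attacker recovers both the key and the message after at most 26 trials.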


12.4.2 Affine Cipher 


Let us take $\mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}$. Let $\mathcal{K} = \{(a, b) \in \mathbb{Z}_{26} \times \mathbb{Z}_{26} : \gcd(a, 26) = 1\}$. For
$K = (a, b) \in \mathcal{K}$, let us define $e_K : \mathcal{P} \to \mathcal{C}$ and $d_K : \mathcal{C} \to \mathcal{P}$ by $e_K(x) = (ax + b) \bmod 26$
and $d_K(y) = a^{-1}(y - b) \bmod 26$, respectively, where $x, y \in \mathbb{Z}_{26}$. In this case, the
size of the key space is $26 \cdot 12 = 312$, which is more than 26, the key space size of
the Caesarian shift cipher.
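
As a small check of the definition, the sketch below enumerates the affine keys and implements the two rules on single elements of $\mathbb{Z}_{26}$. The function names are illustrative only, and the modular inverse is computed with the three-argument pow, which assumes Python 3.8 or later.

```python
from math import gcd


def affine_keys() -> list:
    """All pairs (a, b) in Z_26 x Z_26 with gcd(a, 26) = 1."""
    return [(a, b) for a in range(26) for b in range(26) if gcd(a, 26) == 1]


def affine_encrypt(x: int, key: tuple) -> int:
    """e_K(x) = (a*x + b) mod 26."""
    a, b = key
    return (a * x + b) % 26


def affine_decrypt(y: int, key: tuple) -> int:
    """d_K(y) = a^{-1} * (y - b) mod 26, with a^{-1} the inverse of a mod 26."""
    a, b = key
    a_inv = pow(a, -1, 26)  # exists because gcd(a, 26) = 1
    return (a_inv * (y - b)) % 26


print(len(affine_keys()))  # 312
```

The condition gcd(a, 26) = 1 is what guarantees that a has an inverse modulo 26, and hence that every encryption rule can be undone.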

Remark In today’s computer age, for an exhaustive search, a very powerful
computer or many thousands of PCs that are distributed around the world may be
used. Thus, the number of possible keys must be very large, of the order of at
least $2^{60}$ or $2^{70}$. But we must emphasize that the “sufficient key space principle”
gives a necessary condition for security, not a sufficient one. The next encryption
scheme has a very large key space. However, we shall show that it is still insecure. 


12.4.3 Substitution Cipher 

Let us take $\mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}$. Let $\mathcal{K}$ be the set of all possible permutations on the set
$\{0, 1, 2, \ldots, 25\}$. For each permutation $\pi$, let us define $e_\pi : \mathcal{P} \to \mathcal{C}$ and
$d_\pi : \mathcal{C} \to \mathcal{P}$ by $e_\pi(x) = \pi(x)$ and $d_\pi(y) = \pi^{-1}(y)$, respectively, where $\pi^{-1}$ is the inverse
permutation of $\pi$. Then one can verify that $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ is a cryptosystem. The
total number of possible keys is 26! = 403291461126605635584000000, which is
more than $4.0 \times 10^{26}$, a very large number. 

Let us consider the plain text written in the English language. In that case,
$\mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}$. Let us take a secret key $\pi$ as defined in Table 12.1. Then the plain text 

Table 12.1 A permutation rule $\pi$ in which A is replaced by d, i.e., $\pi(A) = d$, B is replaced by f,
i.e., $\pi(B) = f$, and so on

A   B   C   D   E   F   G   H   I   J   K   L   M
d   f   h   l   n   a   c   e   g   i   k   m   o

N   O   P   Q   R   S   T   U   V   W   X   Y   Z
q   s   u   w   y   z   b   j   p   r   t   v   x

Table 12.2 Percentage of frequency of letters in English text

A      B      C      D      E      F      G      H      I      J      K      L      M
8.15   1.44   2.76   3.79   13.11  2.92   1.99   5.26   6.35   0.13   0.42   3.39   2.54

N      O      P      Q      R      S      T      U      V      W      X      Y      Z
7.10   8.00   1.98   0.12   6.83   6.10   10.47  2.46   0.92   1.54   0.17   1.98   0.08

“CRYPTOGRAPHYISFUN” becomes “hyvubscyduevgzajq”. A brute-force attack
on the key space for this cipher takes much longer than a lifetime, even using the
most powerful computer known today. However, this does not necessarily mean that
the cipher is secure. In fact, as we will show now, it is easy to break this scheme
even though it has a very large key space. For example, an attacker can easily
attack a substitution cipher applied to English text, because the letters in the
English language (or any other human language) are not random. For instance, the
letter q in English is almost always followed by the letter u. Moreover, certain
letters, such as e and t, appear far more frequently than other letters, such as j
and q. Table 12.2 lists the letters with their typical frequencies in English text.
So we see that the most frequent letter is e, followed by t, a, o, and n. It may
also be useful to consider sequences of two or three consecutive letters, called
digrams and trigrams, respectively. The 30 most common digrams are (in decreasing
order) TH, HE, IN, ER, AN, RE, ED, ON, ES, ST, EN, AT, TO, NT, HA, ND, OU, EA, NG,
AS, OR, TI, IS, ET, IT, AR, TE, SE, HI, OF. The 12 most common trigrams are (in
decreasing order) THE, ING, AND, HER, ERE, ENT, THA, NTH, WAS, ETH, FOR, DTH. 

Based on these features of the English language and the fact that in a substitution
cipher a particular letter is always replaced by some other fixed letter, the
attack, known as frequency analysis, on the substitution cipher may be done as
follows:

• The number of different cipher text characters or combination of characters are 
counted to determine the frequencies of usages. 

• The cipher text is examined for patterns, repeated series and common combina- 
tions. 

• The cipher text characters are replaced with possible plain text equivalents using 
the known language characteristics. 
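
The first step of the attack, counting cipher text characters, can be sketched as follows. The key is the permutation of Table 12.1; the function names and the short sample sentence are assumptions made for illustration, and a real attack would need a much longer cipher text for the counts to approach the percentages of Table 12.2.

```python
from collections import Counter
import string

PLAIN = string.ascii_uppercase
CIPHER = "dfhlnacegikmoqsuwyzbjprtvx"  # Table 12.1: A -> d, B -> f, ..., Z -> x
ENC = str.maketrans(PLAIN, CIPHER)


def substitute(plain: str) -> str:
    """Encrypt with the permutation of Table 12.1, dropping non-letters."""
    letters = "".join(ch for ch in plain.upper() if ch in PLAIN)
    return letters.translate(ENC)


def letter_frequencies(cipher: str) -> list:
    """Return (letter, percentage) pairs, most frequent letter first."""
    counts = Counter(cipher)
    total = sum(counts.values())
    return [(ch, 100 * n / total) for ch, n in counts.most_common()]


print(substitute("CRYPTOGRAPHY IS FUN"))  # hyvubscyduevgzajq
sample = substitute("SEE THE TREE")       # hypothetical sample text
print(letter_frequencies(sample)[0])      # ('n', 50.0), and n stands for E
```

In the sample, the most frequent cipher text letter is n, which is the image of E, the most frequent letter of English; this is the kind of guess the attacker makes and then refines with digrams and trigrams.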

Remark Note that the Caesar Cipher is a special case of the Substitution Cipher
which uses only 26 of the 26! possible permutations on 26 elements, namely the
cyclic shifts.

All the examples discussed till now have a common feature, that is, each plain
text letter is always mapped to the same fixed cipher text letter. This feature
leads to the following important definition. 

Definition 12.4.1 A cryptosystem is said to be a mono-alphabetic cryptosystem if 
and only if each alphabetic character is mapped into a unique alphabetic character 
once the key of the cryptosystem is fixed. 

As we have described, the frequency attack on the mono-alphabetic substitution
cipher could be carried out because the mapping of each alphabet was fixed. Thus
such an attack may be avoided by mapping different instances of the same plain
text alphabet to different cipher text alphabets. In that case, counting the character
frequencies will not offer much information about the mapping. This feature leads
to the following definition. 

Definition 12.4.2 A cryptosystem is said to be a poly-alphabetic cryptosystem if 
and only if different instances of the same plain text alphabet are mapped into dif- 
ferent cipher text alphabets. 

In the next section we are going to introduce an example of a poly-alphabetic
cryptosystem. 


12.4.4 Vigenere Cipher 

Let $n$ be a positive integer. Define $\mathcal{C} = \mathcal{P} = \mathcal{K} = (\mathbb{Z}_{26})^n$. For a key
$K = (k_1, k_2, \ldots, k_n)$, we define

$e_K(x_1, x_2, \ldots, x_n) = (x_1 + k_1, x_2 + k_2, \ldots, x_n + k_n) \bmod 26$

$d_K(y_1, y_2, \ldots, y_n) = (y_1 - k_1, y_2 - k_2, \ldots, y_n - k_n) \bmod 26$

Intuitively, the Vigenere cipher works by applying multiple shift ciphers in se- 
quence. Let us consider the following example. 

Example 12.4.1 Let the plain text message be CRYPTOLOGY and the key be CIP. 
Let us represent the alphabets as elements of Z 26 by identifying A by 0, B by 1 and 
so on. 

Then the scheme is as follows: 


Plain text        C    R    Y    P    T    O    L    O    G    Y
x                 2   17   24   15   19   14   11   14    6   24
Key               C    I    P    C    I    P    C    I    P    C
k                 2    8   15    2    8   15    2    8   15    2
(x + k) mod 26    4   25   13   17    1    3   13   22   21    0
Cipher text       e    z    n    r    b    d    n    w    v    a

Remark Note that the two different instances of the same plain text alphabet O in
the word CRYPTOLOGY are mapped into different cipher text alphabets, namely
d and w. If the key is a sufficiently long word (chosen at random), then breaking
this cipher seems to be a very difficult task. Indeed, it was considered by many to
be an unbreakable cipher, and although it was invented in the 16th century, a
systematic attack on the scheme was only devised hundreds of years later. For more
details, we refer the readers to the text books (Menezes et al. 1996; Stinson 2002;
Katz and Lindell 2007). 
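
Example 12.4.1 can be reproduced with a short sketch. The function name vigenere is illustrative; as in the example, the key word is repeated along the message, and the text is assumed to contain the letters A to Z only.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Add (or subtract) the repeating key letter by letter in Z_26."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text.upper()):
        k = ord(key[i % len(key)].upper()) - ord("A")  # key letter as 0..25
        x = ord(ch) - ord("A")                         # text letter as 0..25
        out.append(chr((x + sign * k) % 26 + ord("A")))
    return "".join(out)


cipher = vigenere("CRYPTOLOGY", "CIP")
print(cipher)                                 # EZNRBDNWVA
print(vigenere(cipher, "CIP", decrypt=True))  # CRYPTOLOGY
```

The two occurrences of O sit under the key letters P and I, which is why they are sent to D and W, respectively.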


12.4.5 Hill Cipher 


The Hill cipher was first introduced in 1929 by the mathematician Lester S. Hill (1929).
This cipher may be thought of as the first systematic and simple poly-alphabetic
cipher in which algebra, especially linear algebra, was used. Intuitively, in the
Hill cipher, the plain text is divided into groups of adjacent letters of some
fixed length, say $n$, and then each such group is transformed into a different
group of $n$ letters. An arbitrary Hill $n$-cipher has as its key a given $n \times n$
non-singular matrix whose entries are integers $0, 1, 2, \ldots, t - 1$, where $t$ is the
size of the set of alphabets, i.e., if the alphabets are the English letters only,
then $t = 26$. The Hill cipher for the English alphabets may be explained as follows:
For $n \geq 2$, let $\mathcal{P} = \mathcal{C} = (\mathbb{Z}_{26})^n$ and
$\mathcal{K} = \{K : K$ is a non-singular matrix of order $n$ with entries from $\mathbb{Z}_{26}\}$, where
non-singular means invertible over $\mathbb{Z}_{26}$. Let us take an $n \times n$ non-singular matrix
$K = (k_{ij})_{n \times n}$ as our key. For $x = (x_1, x_2, \ldots, x_n) \in \mathcal{P}$ and $K \in \mathcal{K}$, the encryption
rule $e_K(x) = y = (y_1, y_2, \ldots, y_n)$ is defined as follows:

$$(y_1, y_2, \ldots, y_n) = (x_1, x_2, \ldots, x_n)
\begin{pmatrix}
k_{11} & k_{12} & \cdots & k_{1n} \\
k_{21} & k_{22} & \cdots & k_{2n} \\
\vdots & \vdots &        & \vdots \\
k_{n1} & k_{n2} & \cdots & k_{nn}
\end{pmatrix}.$$

So the idea is to take $n$ linear combinations of the $n$ alphabetic characters in
one plain text element to produce the $n$ characters of one cipher text element.
Now we shall show how to decrypt the cipher text. The above equation can be written
as $y = xK$. As $K$ is a non-singular matrix, for decryption we can use
$d_K(y) = yK^{-1} = x$, where all the operations are performed in $\mathbb{Z}_{26}$.


Example 12.4.2 Let us consider a simple example of a Hill cipher with the set of
alphabets as the English letters only, i.e., “a” to “z” only. While encrypting and
decrypting, we shall represent the 26 characters in our alphabet in order by the
non-negative integers 1, 2, ..., 26 (= 0). Let $n = 2$, i.e., we are considering only
the Hill 2-cipher with the key $K = \begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix}$. Then the plain text element $x$ could be
written as $x = (x_1, x_2)$ and the cipher text element as
$y = (y_1, y_2) = (x_1, x_2) \begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix}$. Suppose
Alice wants to send some secret message, say “LOVE”, to Bob over an insecure
channel using the above mentioned Hill 2-cipher. Since $n = 2$, Alice will break
“LOVE” into two parts, each containing two characters, i.e., “LO” and “VE”. The
corresponding integer values are (12, 15) and (22, 5), respectively, since L, O, V, E
correspond to the integers 12, 15, 22, and 5, respectively. Note that the matrix $K$
is known to both Alice and Bob. For encryption Alice will compute

$$(12, 15) \begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix} \bmod 26 = (10, 7) = (j, g)$$

and

$$(22, 5) \begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix} \bmod 26 = (8, 13) = (h, m).$$

So Alice will send to Bob the cipher text “jghm” over an insecure channel. After
receiving it, Bob will break “jghm” into “jg” and “hm”, convert them into (10, 7)
and (8, 13), and compute:

$$(10, 7) \begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix}^{-1} \bmod 26 = (10, 7) \begin{pmatrix} 19 & 25 \\ 8 & 11 \end{pmatrix} \bmod 26 = (12, 15) = (l, o)$$

and

$$(8, 13) \begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix}^{-1} \bmod 26 = (8, 13) \begin{pmatrix} 19 & 25 \\ 8 & 11 \end{pmatrix} \bmod 26 = (22, 5) = (v, e).$$

So Bob gets back the message “love” sent from Alice. 
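
The arithmetic of Example 12.4.2 can be checked with the following sketch. It follows the example’s numbering a = 1, ..., z = 26 (= 0); the helper names are hypothetical, the text is assumed to have even length, and the modular inverse of the determinant uses the three-argument pow of Python 3.8 or later.

```python
K = ((7, 3), (2, 5))  # key matrix of Example 12.4.2


def inverse_2x2(m: tuple) -> tuple:
    """Inverse of a 2 x 2 matrix over Z_26 via the adjugate formula."""
    (a, b), (c, d) = m
    det_inv = pow((a * d - b * c) % 26, -1, 26)  # needs gcd(det, 26) = 1
    return (
        ((d * det_inv) % 26, (-b * det_inv) % 26),
        ((-c * det_inv) % 26, (a * det_inv) % 26),
    )


def to_num(ch: str) -> int:
    return (ord(ch.lower()) - ord("a") + 1) % 26  # a -> 1, ..., z -> 0


def to_chr(v: int) -> str:
    return chr((v - 1) % 26 + ord("a"))           # 1 -> a, ..., 0 -> z


def hill2(text: str, m: tuple) -> str:
    """Multiply each pair of letters, as a row vector, by the matrix m mod 26."""
    nums = [to_num(ch) for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x1, x2 = nums[i], nums[i + 1]
        out.append((x1 * m[0][0] + x2 * m[1][0]) % 26)
        out.append((x1 * m[0][1] + x2 * m[1][1]) % 26)
    return "".join(to_chr(v) for v in out)


K_INV = inverse_2x2(K)
print(K_INV)                 # ((19, 25), (8, 11))
print(hill2("love", K))      # jghm
print(hill2("jghm", K_INV))  # love
```

Here det K = 29, which is 3 modulo 26, and the inverse of 3 modulo 26 is 9; multiplying the adjugate of K by 9 gives the inverse matrix used by Bob.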


Remark For the cryptanalysis of the Hill cipher, we refer the readers to the text 
book (Stinson 2002). 


In all of the above ciphers, if we observe carefully, we note that before sending
the secret message, the sender and the receiver have to agree upon a common key
secretly. The same key is used for both encryption and decryption. This type of
cryptosystem is known as a private-key or symmetric key cryptosystem. One drawback
of a private-key cryptosystem is that it requires prior communication of the
common key k between sender and receiver, using a secure channel, before any
cipher text is transmitted. In practice, this may be very difficult to achieve, as
the sender (say Alice) and the receiver (say Bob) may not have the luxury of meeting
beforehand, or they may not have access to a reasonably secure channel. Since
Alice and Bob are not meeting in private, we have to assume that the adversary or
the attacker is able to hear everything that Alice and Bob pass back and forth. Under
this situation, is it possible for Alice and Bob to exchange a secret key for future
secure communication? This leads to an interesting concept known as the public-key
or asymmetric cryptosystem.


12.5 Introduction to Public-Key Cryptography 

Intuitively, in a public-key cryptosystem, there are two keys instead of one secret
key. One of these keys is an encryption key, used by the sender to encrypt the
message, and the other is a decryption key, used by the receiver to decrypt the
cipher text. The most important thing is that the secrecy of encrypted messages
should be preserved even against an adversary who knows the encryption key (but not
the decryption key). Cryptosystems with this property are called asymmetric or
public-key cryptosystems, in contrast to the symmetric or private-key encryption
schemes. In a public-key cryptosystem the encryption key is called the public key,
since it is kept open or published by the receiver so that anyone who wishes to send
an encrypted message may do so, and the decryption key is called the private key,
since it is kept completely private by the receiver. 

There are certain advantages of public-key cryptography over the secret key 
cryptography. As the public-key encryption allows key distribution to be done over 
public channels, it simplifies the initial deployment of the system. As a result, the 
maintenance of the system becomes quite easy when parties join or leave the cryp- 
tosystem. Further, if we are going to use a public-key cryptosystem, then we do not 
need to store many secret keys for further secure communication. Even if all pairs 
of parties want the ability to communicate securely, each party needs only to store 
his/her own private key in a secure fashion. The public keys of the other parties can 
either be obtained when needed, or stored in a non-secure fashion. Finally, public- 
key cryptography plays an important role in the scenario where parties, who have 
never previously interacted, want the ability to communicate securely. For example, 
a merchant may post his public key on-line; any buyer making a purchase can obtain 
the merchant’s public key when they need to encrypt their credit card information. 

Asymmetric cryptography is a relatively new field. Whitfield Diffie and Martin
Hellman, in 1976, published a paper entitled “New Directions in Cryptography”
(Diffie and Hellman 1976) in which they formulated the concept of a public-key
encryption system. A short time earlier, Ralph Merkle had independently invented
a public-key construction for an undergraduate project at Berkeley, but this
was little understood at the time. Merkle’s paper entitled “Secure communication
over insecure channels” appeared in Merkle (1982). However, it turns out that the
concept of public-key encryption was originally discovered by James Ellis while
working at the British Government Communications Headquarters (GCHQ). Ellis’s
discoveries in 1969 were classified as secret material by the British government and
were not declassified and released until 1997, after his death (cf. Hoffstein et al.
2008). 

The Diffie-Hellman key agreement was invented in 1976 during a collaboration
between Whitfield Diffie and Martin Hellman and was the first practical method
for establishing a shared secret over an insecure communication channel. The most
important contribution of the paper by Diffie and Hellman (1976) was the definition
of a Public Key Cryptosystem (PKC) and its associated components, namely one-way
functions and trapdoor information, as defined below. 

Definition 12.5.1 A one-way function is an invertible function that is easy to
compute, but whose inverse is difficult to compute, i.e., any algorithm that
attempts to compute the inverse in a reasonable amount of time will almost
certainly fail, where the phrase almost certainly must be defined probabilistically. 

A secure public-key cryptosystem is built using one-way functions that have a
trapdoor, which is defined below. 

Definition 12.5.2 The trapdoor of a function is a piece of auxiliary information that 
allows the inverse of the function to be computed easily but without that auxiliary 
information it is computationally very hard to compute the inverse function. 

Informally, a public-key (or asymmetric) cryptosystem consists of two types of
keys, namely, a private key $k_{\mathrm{priv}}$ and a public key $k_{\mathrm{pub}}$. Corresponding to each
public/private-key pair $(k_{\mathrm{priv}}, k_{\mathrm{pub}})$, there is an encryption algorithm $e_{k_{\mathrm{pub}}}$ and the
corresponding decryption algorithm $d_{k_{\mathrm{priv}}}$. The encryption algorithm $e_{k_{\mathrm{pub}}}$
corresponding to $k_{\mathrm{pub}}$ is public knowledge and easy to compute. Similarly, the
decryption algorithm $d_{k_{\mathrm{priv}}}$ must be easily computable by someone who knows the private
key $k_{\mathrm{priv}}$, but it should be very difficult to compute for someone who knows only
the public key $k_{\mathrm{pub}}$. The idea behind a public-key cryptosystem is that it might be
possible to find a cryptosystem where it is computationally infeasible to determine
the decryption function $d_{k_{\mathrm{priv}}}$, given the encryption function $e_{k_{\mathrm{pub}}}$ and the public key
$k_{\mathrm{pub}}$, but with the knowledge of the secret key $k_{\mathrm{priv}}$ it is easy to compute the inverse of
$e_{k_{\mathrm{pub}}}$, i.e., $d_{k_{\mathrm{priv}}}$. In other words, the private key $k_{\mathrm{priv}}$ may be thought of as trapdoor
information for the function $e_{k_{\mathrm{pub}}}$. Under this situation, say Alice wants to send a
secret message to Bob. Then Alice will use the encryption rule $e_{k_{\mathrm{pub}}}$ that is made
public by Bob to encrypt the secret message and will send the encrypted message
to Bob through an insecure channel. A public-key cryptosystem is said to be secure
if it is computationally infeasible to determine $d_{k_{\mathrm{priv}}}$ given $e_{k_{\mathrm{pub}}}$ and $k_{\mathrm{pub}}$. In that
case, only Bob can decrypt the cipher text to get the plain text. The main advantage
of a public-key cryptosystem is that Alice can send an encrypted message to Bob,
without any prior communication of a secret key, by using the public encryption rule
$e_{k_{\mathrm{pub}}}$ (hence the name public-key cryptosystem). The formal definition of a public-key
cryptosystem is as follows. 

Definition 12.5.3 A public-key (or asymmetric) cryptosystem is a five-tuple
$(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where the
following conditions are satisfied:

• $\mathcal{P}$, the plain text space, is a finite set of possible plain texts,

• $\mathcal{C}$, the cipher text space, is a finite set of possible cipher texts,

• $\mathcal{K}$, the key space, is a finite set of possible keys,

• For each $k = (k_{\mathrm{priv}}, k_{\mathrm{pub}}) \in \mathcal{K}$, there is
an encryption rule $e_{k_{\mathrm{pub}}} \in \mathcal{E}$ and the corresponding
decryption rule $d_{k_{\mathrm{priv}}} \in \mathcal{D}$. Each
$e_{k_{\mathrm{pub}}} : \mathcal{P} \to \mathcal{C}$ and
$d_{k_{\mathrm{priv}}} : \mathcal{C} \to \mathcal{P}$ are functions such that
$d_{k_{\mathrm{priv}}}(e_{k_{\mathrm{pub}}}(x)) = x$ for every plain text
element $x \in \mathcal{P}$.

It is quite interesting to note that Diffie and Hellman formulated this concept
without finding a specific encryption/decryption pair of functions, although
they did propose a similar method through which Alice and Bob can securely
exchange a random piece of data whose value is not known initially to either
one. Before describing the method, let us first give a brief introduction to
the discrete logarithm problem.





12.5.1 Discrete Logarithm Problem (DLP) 

The discrete logarithm problem is a well-known mathematical problem that plays
an important role in public-key cryptography. The Diffie-Hellman key exchange
protocol (Diffie and Hellman 1976) and the ElGamal public-key cryptosystem are
based on the discrete logarithm problem in a finite field $\mathbb{Z}_p$, where
$p$ is a large prime. Let us first explain the problem.

Let $p$ be a large prime. Recall that
$\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{(0)\}$ is a cyclic group of order
$p - 1$ (see Chap. 10). Thus there exists a primitive element, say $g$, of
$\mathbb{Z}_p^*$. As a result, every non-zero element of $\mathbb{Z}_p$ is
equal to some power of $g$. Further recall that, by Fermat's Little Theorem,
$g^{p-1} = (1)$, and, since $g$ is primitive, no smaller positive power of $g$
is equal to $(1)$. Hence all the elements of $\mathbb{Z}_p^*$ may be
represented as $\mathbb{Z}_p^* = \{g, g^2, \ldots, g^{p-1}\}$. Now we define
the discrete logarithm problem in $\mathbb{Z}_p^*$.

Definition 12.5.4 The Discrete Logarithm Problem (DLP) in $\mathbb{Z}_p^*$
states that, given a primitive element $g$ of $\mathbb{Z}_p^*$ and an element
$h \in \mathbb{Z}_p^*$, find $x$ such that $g^x = h$. The number $x$ is called
the discrete logarithm of $h$ modulo $p$ to the base $g$ and is denoted by
$\log_g(h)$.

Remark To date, no efficient algorithm is known that solves the DLP in general.
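To make the definition concrete, the following is a minimal Python sketch of the naive exhaustive search for a discrete logarithm. The function name and the toy parameters $p = 11$, $g = 2$ are illustrative assumptions, not part of the text; the point is only that the loop may run through up to $p - 1$ exponents, which is hopeless when $p$ has hundreds of digits.

```python
def discrete_log_bruteforce(g, h, p):
    """Return the smallest x in [1, p-1] with g**x = h (mod p), by trying every exponent."""
    value = 1
    for x in range(1, p):
        value = (value * g) % p  # invariant: value == g**x mod p
        if value == h % p:
            return x
    raise ValueError("h is not a power of g modulo p")


# Toy parameters (hypothetical): 2 is a primitive element of Z_11^*.
p, g, h = 11, 2, 9
x = discrete_log_bruteforce(g, h, p)
assert pow(g, x, p) == h
```

The running time is proportional to $p$, i.e., exponential in the bit length of $p$.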

Although we have defined the discrete logarithm problem in terms of a primitive
base $g$, this is not mandatory in general. We can take elements of any group
and use the group law instead of multiplication. This leads to the most general
form of the discrete logarithm problem as follows.

Definition 12.5.5 The Discrete Logarithm Problem (DLP) in a group $(G, \circ)$
states that, given elements $g$ and $h$ in $G$, find $x$ such that

$$\underbrace{g \circ g \circ \cdots \circ g}_{x \text{ times}} = h.$$

Remark The discrete logarithm problem is not always hard; its hardness depends
on the group. For example, a popular choice of group for discrete
logarithm-based cryptosystems is $\mathbb{Z}_p^*$, where $p$ is a prime number.
However, if $p - 1$ is a product of small primes, then the Pohlig-Hellman
algorithm can solve the discrete logarithm problem in this group very
efficiently. That is why, when using $\mathbb{Z}_p^*$ as the basis of discrete
logarithm-based cryptosystems, one chooses a special type of prime $p$, known
as a safe prime, which is of the form $2q + 1$, where $q$ is a large prime.


12.5.2 Diffie-Hellman Key Exchange 

Until 1976, it was believed that encryption could not be done without sharing a
secret key between the sender and the receiver. However, Diffie and Hellman
noticed that there is a natural asymmetry in the world: certain actions can be
performed easily but not easily reversed. For example, it is easy to multiply
two large primes but difficult to recover these primes from their product. The
existence of such phenomena motivates the construction of an encryption scheme
that does not rely on shared secrets, but rather one for which encrypting is
"easy" while reversing this operation (i.e., decrypting) is infeasible for
anyone other than the designated receiver. Using this observation, Diffie and
Hellman solved the following interesting problem, which at first looks
infeasible. Suppose Alice wants to share some secret information with Bob. This
information may be a common key for future secure communication between Alice
and Bob using symmetric key encryption. The only problem is that every
communication between Alice and Bob will be observed by the adversary. In this
situation, the question is: how can Alice and Bob share a key without making it
available to the adversary? At first glance, it seems impossible for Alice and
Bob to solve the problem. But Diffie and Hellman came up with a brilliant
solution that relies on the difficulty of solving the discrete logarithm
problem in $\mathbb{Z}_p^*$, where $p$ is a large prime.

The steps for Diffie-Hellman key exchange protocol are as follows: 

• At the first step, Alice and Bob agree on a large prime $p$ and a generator,
say $g$, of the cyclic group $\mathbb{Z}_p^*$.

• Alice and Bob make the values of $p$ and $g$ public, i.e., these values are
known to all, in particular to the adversary.

• Alice chooses a random integer $a$ that she keeps secret, while at the same
time Bob chooses a random integer $b$ that he keeps secret.

• Alice and Bob use the secret integers $a$ and $b$ to compute
$X = g^a \pmod{p}$ and $Y = g^b \pmod{p}$, respectively.

• Alice sends the value $X$ to Bob through an insecure channel, while Bob sends
the value $Y$ to Alice through an insecure channel. (Note that the adversary
can see the values of $X$ and $Y$ as they are sent over insecure channels.)

• After getting the value $Y$ from Bob, Alice computes, with the help of her
secret value $a$, the value $X' = Y^a \pmod{p}$, while Bob computes
$Y' = X^b \pmod{p}$ with the help of his secret value $b$. Note that, as
$\mathbb{Z}_p^*$ is a commutative group,
$X' = Y^a = (g^b)^a = g^{ab} = (g^a)^b = X^b = Y' \pmod{p}$.

• Thus the common key for Alice and Bob is $g^{ab} \pmod{p}$.
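The steps above can be traced in a few lines of Python. This is only a sketch with toy parameters: the prime $p = 23$ and generator $g = 5$ are illustrative assumptions chosen so the numbers stay readable, and the function names are hypothetical. A real deployment would use a prime of thousands of bits.

```python
import secrets


def dh_keypair(p, g):
    """Pick a secret exponent and return (secret, g**secret mod p)."""
    secret = secrets.randbelow(p - 2) + 1  # uniform in [1, p-2]
    return secret, pow(g, secret, p)


def dh_shared(their_public, my_secret, p):
    """Combine the other party's public value with one's own secret exponent."""
    return pow(their_public, my_secret, p)


p, g = 23, 5                 # toy public parameters; 5 generates Z_23^*
a, X = dh_keypair(p, g)      # Alice: secret a, public X = g^a mod p
b, Y = dh_keypair(p, g)      # Bob:   secret b, public Y = g^b mod p

# Alice computes Y^a, Bob computes X^b; both equal g^(ab) mod p.
assert dh_shared(Y, a, p) == dh_shared(X, b, p) == pow(g, a * b, p)
```

Only $p$, $g$, $X$ and $Y$ ever cross the channel; the exponents $a$ and $b$ never leave their owners.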

Note that the adversary knows the values of $X = g^a \pmod{p}$ and
$Y = g^b \pmod{p}$. The adversary also knows the values of $g$ and $p$. To the
adversary, only $a$ and $b$ are unknown. So if the adversary can solve the DLP,
then he can find both $a$ and $b$ and thus can easily compute the shared secret
value $g^{ab}$ of Alice and Bob. Although it may seem at this point that Alice
and Bob are safe provided that the adversary is unable to solve the DLP, this
is not quite correct. Solving the DLP is certainly one method of finding the
shared value of Alice and Bob, but it is not the precise problem that the
adversary needs to solve. The adversary actually needs to solve the following
problem, known as the Diffie-Hellman Problem (DHP).

Definition 12.5.6 Let $p$ be a prime and $g$ be a primitive element, i.e., a
generator of the cyclic group $\mathbb{Z}_p^*$. The Diffie-Hellman Problem
(DHP) is the problem of computing the value of $g^{ab} \pmod{p}$ from the known
values of $g$, $p$, $g^a \pmod{p}$ and $g^b \pmod{p}$.

Remark There is no doubt that the DHP is no harder than the DLP, i.e., if
someone can solve the DLP, then he can compute the secret values $a$ and $b$ of
Alice and Bob, respectively, and hence their common key $g^{ab}$. However, the
converse is less clear: if the adversary has an algorithm which can efficiently
solve the DHP, can the adversary use that algorithm to solve the DLP
efficiently? The answer to this question is still not known.

In the next section we discuss the celebrated public-key encryption scheme
known as RSA, based on factorization of integers and simple group theoretic
techniques.


12.6 RSA Public Key Cryptosystem 

In 1977, a year after the publication of the paper by Diffie and Hellman, three
researchers at MIT developed a practical method to construct a public-key
cryptosystem. This became known as RSA, after the initials of the three
developers: Ron Rivest, Adi Shamir and Leonard Adleman. RSA is probably the
most widely used public-key cryptosystem. It was patented in the USA in 1983.

We now explain the RSA cryptosystem (Rivest et al. 1978). But before that we
need a small introduction. Many of us have the impression that a computer can
carry out any computation given to it very quickly. Unfortunately, there are
many computational tasks that even computers cannot do quickly. Let us consider
two three-digit prime numbers, say $p = 101$ and $q = 113$. Let $n = pq$. Now
if we give $n$ to a computer and ask it, by writing a suitable program, to
factorize $n$, the computer will do so readily. Now we consider $p$ and $q$,
each being a prime of 512 bits. For example, we may consider

p = 12735982667070981883239844093975974547292895820146911015162
    6034935609510207618413911349026653679890168137094014531829258413693
    09737598466522100221796766571

and

q = 981563352854104834403698019941
    1006531775509024061499455934333970333629916904868989032937272369426
    112046753560494458851516387792268371864577065884840687021,

then

n = 1250117384858195736607120322114276853166600971355669003183482912941
    6865795280125867985804891699478434726783362291586508315065615625486
    6360306001879355302167187240476838112856712813416806772774907378919
    5351738841586300440343902785521910200349245788252828073180486676801
    31867741261022374538179761526720006374991

is a 1024-bit number having 309 decimal digits. Now if we input $n$ to a
computer and ask it to factorize $n$, then it will not be possible, even for a
supercomputer, to factorize the number $n$ in a reasonable time. So even for a
computer, the factorization of a large number which is a product of two
distinct large primes is considered to be a hard problem. This is because no
efficient algorithm for factorization has yet been discovered. This inability
of computers adds strength to the public-key cryptographic scheme RSA, which is
described in the next section.

Table 12.3 Algorithm for RSA public key cryptosystem

• Key Generation Algorithm by Bob:

1. Generate two distinct large prime numbers $p$ and $q$, each roughly of the same size.

2. Compute $n = pq$ and $\phi(n) = (p - 1)(q - 1)$.

3. Select a random integer $e$ with $1 < e < \phi(n)$, such that $\gcd(e, \phi(n)) = 1$.

4. Use the extended Euclidean algorithm to find the integer $d$, $1 < d < \phi(n)$, such that
$ed \equiv 1 \pmod{\phi(n)}$.

5. Make public the public keys $n$ and $e$ (which are known to everybody) and keep secret
the private keys $p$, $q$, and $d$ (which are known only to Bob).

• Encryption Algorithm for Alice:

1. Obtain Bob's public key $(n, e)$.

2. Represent the message $m$ as an integer from the set $\{0, 1, 2, \ldots, n - 1\}$.

3. Compute $c = m^e \pmod{n}$.

4. Send the cipher text $c$ to Bob.

• Decryption Algorithm for Bob:

1. To obtain the plain text message $m$, Bob uses his private key $d$ to get $m = c^d \pmod{n}$.


12.6.1 Algorithm for RSA Cryptosystem (Rivest et al. 1978) 

Suppose Alice wants to send a secret message to Bob using the RSA public-key 
cryptosystem. Note that as Alice wants to send the message to Bob, Bob has to take 
the initiative to construct his public- and private-key pairs. This step is known as the 
key generation step. Alice then collects the public key of Bob and uses the publicly 
known encryption algorithm to get the cipher text. Alice then sends the cipher text to 
Bob through an insecure channel. After receiving the cipher text from Alice, using 
his private key, Bob decrypts the cipher text to get the plain text. The actual steps 
for RSA algorithm are described in Table 12.3. 

Thus formally, the RSA cryptosystem is a five-tuple
$(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where
$\mathcal{P} = \mathcal{C} = \mathbb{Z}_n$, $n$ is the product of two distinct
large prime numbers $p$ and $q$,
$\mathcal{K} = \{(n, p, q, e, d) : ed \equiv 1 \pmod{\phi(n)}\}$, and
$\phi(n)$, known as the Euler $\phi$ function, is the number of positive
integers not exceeding $n$ and relatively prime to $n$. For each
$K = (n, p, q, e, d) \in \mathcal{K}$, the encryption function $e_K$ is defined
by $e_K(x) = x^e \pmod{n}$ and the decryption function $d_K$ is defined by
$d_K(y) = y^d \pmod{n}$, where $x, y \in \mathbb{Z}_n$. The values $n$ and $e$
are considered to be the public keys while the values $p$, $q$, and $d$ are
considered to be the private keys.
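As an illustration, here is a minimal Python sketch of key generation, encryption and decryption for the toy primes $p = 101$ and $q = 113$ mentioned earlier. The public exponent $e = 3533$, the message $9726$ and the function names are assumptions made for this example, and the primes are far too small for any real use. The call `pow(e, -1, phi)` (Python 3.8 or later) plays the role of the extended Euclidean algorithm.

```python
from math import gcd


def rsa_keygen(p, q, e):
    """Return ((n, e), d) for given distinct primes p, q and public exponent e."""
    n, phi = p * q, (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(n)")
    d = pow(e, -1, phi)  # modular inverse of e modulo phi(n)
    return (n, e), d


def rsa_encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)  # c = m^e mod n


def rsa_decrypt(c, n, d):
    return pow(c, d, n)  # m = c^d mod n


public_key, d = rsa_keygen(101, 113, 3533)   # toy parameters only
n, e = public_key
c = rsa_encrypt(9726, public_key)
assert rsa_decrypt(c, n, d) == 9726
```

Bob publishes `(n, e)` and keeps `d`, `p`, `q` to himself; Alice needs only the public pair to run `rsa_encrypt`.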


12.6.2 Sketch of the Proof of the Decryption 

A natural question that comes to mind is: do we really have
$m = c^d \pmod{n}$? The sketch of the proof is as follows.

It is given that $ed \equiv 1 \pmod{\phi(n)}$. So there must exist some integer
$t$ such that $ed = 1 + t\phi(n)$. Since $m \in \{0, 1, 2, \ldots, n - 1\}$, we
consider the following four cases. Case 1: $\gcd(m, p) = p$ and
$\gcd(m, q) = q$ (this case happens only when $m = 0$); Case 2:
$\gcd(m, p) = 1$ and $\gcd(m, q) = 1$; Case 3: $\gcd(m, p) = 1$ and
$\gcd(m, q) = q$; and finally, Case 4: $\gcd(m, p) = p$ and $\gcd(m, q) = 1$.
If $\gcd(m, p) = 1$, then by Fermat's Theorem,

$$m^{p-1} \equiv 1 \pmod{p} \implies m^{t(p-1)(q-1)} \equiv 1 \pmod{p} \implies m^{1 + t(p-1)(q-1)} \equiv m \pmod{p}.$$

Now if $\gcd(m, p) = p$, then the above congruence also holds, as both sides
are equal to 0 modulo $p$. Hence in both cases,
$m^{ed} \equiv m \pmod{p}$. By the same argument it follows that
$m^{ed} \equiv m \pmod{q}$. Finally, since $p$ and $q$ are distinct prime
numbers, it follows (by the Chinese Remainder Theorem) that
$m^{ed} \equiv m \pmod{n}$ and hence $c^d = (m^e)^d \equiv m \pmod{n}$. Hence
the result follows.
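The argument covers every $m$ in $\{0, 1, \ldots, n - 1\}$, including those that share a factor with $n$. As a sanity check (not a proof), the following Python sketch verifies $m^{ed} \equiv m \pmod{n}$ exhaustively for a small modulus. The function name and the toy parameters $p = 3$, $q = 11$, $e = 3$ are assumptions made for illustration.

```python
def rsa_roundtrip_holds(p, q, e):
    """Check (m^e)^d = m (mod n) for every m in {0, ..., n-1}."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # requires gcd(e, phi) == 1 (Python 3.8+)
    return all(pow(pow(m, e, n), d, n) == m for m in range(n))


# Toy modulus n = 33; the loop includes m = 0 and the multiples of 3 and 11,
# i.e., Cases 1, 3 and 4 of the proof sketch as well as Case 2.
assert rsa_roundtrip_holds(3, 11, 3)
```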

Note It is currently difficult to obtain the private key $d$ from the public
key $(n, e)$. However, if one could factor $n$ into $p$ and $q$, then one could
obtain the private key $d$. Thus the security of the RSA system is based on the
assumption that factoring is difficult. The discovery of an easy method of
factoring would "break" the RSA cryptosystem.

Remark The encryption function $e_K$ defined by $e_K(m) = m^e \pmod{n}$, for
all $m \in \mathcal{P}$ and $K \in \mathcal{K}$, is a homomorphism with respect
to multiplication modulo $n$, as
$e_K(m_1 m_2) = (m_1 m_2)^e \pmod{n} = e_K(m_1) e_K(m_2)$. For this reason, RSA
is known as a homomorphic encryption scheme.


12.6.3 Some Attacks on RSA Cryptosystem 

To understand some of the attacks on the RSA cryptosystem, let us first review
some of the properties of this cryptosystem. First of all, this cryptosystem is
deterministic or mono-alphabetic, i.e., the same plain text message will always
be encrypted to the same cipher text. Further, the RSA encryption function is
homomorphic, i.e., $e_K(x \cdot y) = e_K(x) e_K(y)$, for all
$x, y \in \mathcal{P}$ and for all $K \in \mathcal{K}$. Based on these
properties, we describe some attacks on the RSA cryptosystem.

• Encrypting short messages using a small encryption exponent $e$: Suppose
that, for the RSA encryption, we encrypt a short message $m$ using a small
encryption exponent $e$. Let us illustrate this for a particular small $e$:
assume $e$ to be 3 and the short message $m$ to be such that $m < n^{1/3}$. The
only information regarding the plain text $m$ that is known to the adversary is
that $m < n^{1/3}$. In that case, a very practical attack can be made by the
adversary. Note that the major advantage for the adversary is that the
encryption of $m$ does not involve any modular reduction, since the integer
$m^3$ is less than $n$. As a result, given the cipher text
$c = m^3 \pmod{n}$, the adversary can determine $m$ by simply computing
$m = c^{1/3}$ over the integers.

• Encrypting the same message using the same small encryption exponent $e$: The
above attack illustrates that short messages can be recovered easily when a
small encryption exponent is used. Now we extend the attack to messages of
arbitrary length, as long as the same message $m$ is sent to multiple receivers
with the same small encryption exponent $e$ but with different moduli. Let us
illustrate the attack with the small exponent $e = 3$. Let the same message $m$
be sent to three different receivers using the three public keys
$pk_1 = (n_1, 3)$, $pk_2 = (n_2, 3)$ and $pk_3 = (n_3, 3)$, respectively, where
$n_i = p_i q_i$, and $p_i$ and $q_i$ are primes, $i = 1, 2, 3$. Then the cipher
texts are $c_1 = m^e \pmod{n_1}$, $c_2 = m^e \pmod{n_2}$ and
$c_3 = m^e \pmod{n_3}$. As the cipher texts reach the receivers through
insecure channels, the adversary can get hold of these cipher texts. Note that
we may assume that $\gcd(n_i, n_j) = 1$ for $i \neq j$; otherwise,
$\gcd(n_i, n_j)$ is a non-trivial factor of $n_i$, so the adversary can
factorize $n_i$ to get $p_i$ and $q_i$ and then decrypt the cipher text $c_i$
to get the plain text $m$. Let $n = n_1 n_2 n_3$. Then by the Chinese Remainder
Theorem, the adversary can find a unique non-negative value $c < n$ such that
$c \equiv c_1 \pmod{n_1}$, $c \equiv c_2 \pmod{n_2}$ and
$c \equiv c_3 \pmod{n_3}$, i.e., $c \equiv m^3 \pmod{n_1}$,
$c \equiv m^3 \pmod{n_2}$ and $c \equiv m^3 \pmod{n_3}$. As $n_1$, $n_2$ and
$n_3$ are pairwise relatively prime, we have $c \equiv m^3 \pmod{n}$. Observe
that $m < n_i$ for all $i = 1, 2, 3$. Hence $m^3 < n$, so that $c = m^3$ over
the integers. Thus, using the technique described in the previous attack, the
adversary recovers $m$ as the integer cube root of $c$.

• Encrypting the same message using two relatively prime exponents with the
same modulus $n$: Suppose Alice wants to send the same message $m$ to two of
her employees, say Bob1 and Bob2, using the public keys $(n, e_1)$ and
$(n, e_2)$ of Bob1 and Bob2, respectively, where $\gcd(e_1, e_2) = 1$. Then the
adversary can get hold of the cipher texts $c_1 = m^{e_1} \pmod{n}$ and
$c_2 = m^{e_2} \pmod{n}$ of Bob1 and Bob2, respectively. As
$\gcd(e_1, e_2) = 1$, by using the extended Euclidean algorithm, the adversary
can find integers $u$ and $v$ such that $u e_1 + v e_2 = 1$. Then the adversary
will compute $c_1^u \cdot c_2^v \pmod{n}$, where a negative exponent is handled
by first inverting the corresponding cipher text modulo $n$. Note that
$c_1^u \cdot c_2^v \pmod{n} = m^{u e_1 + v e_2} \pmod{n} = m \pmod{n}$. Thus
from $c_1$ and $c_2$, the adversary will get the message $m$.

• Bidding attack: Suppose an organization has decided to buy some software that
many software companies produce, and it wants to select the supplier through
global tendering. That is, the organization will publish an advertisement in
which it specifies the requirements of the software and asks software companies
to submit quotations for that product. The organization will give the project
to the software company with the minimum quotation. As the organization wants
open tendering for transparency, everybody will see what is being sent to the
organization. As a result, the software companies will not send the quotation
(only the amount, e.g., 20000000) in plain text form; they must encrypt the
quotation and send its cipher text. To get uniformity as well as high security,
the organization decides to use the RSA encryption scheme with large primes $p$
and $q$. Each software company gets the public key $(n, e)$ to encrypt its
quotation. So far the system looks quite safe and useful. However, we shall
show that, due to the homomorphic property of the encryption, RSA encryption is
unsafe for bidding or auction purposes. We shall show how an adversary can
submit a quotation which is a modification of a valid quotation of some
software company. Suppose initially there are $k$ software companies who
submitted the cipher texts $c_1, c_2, \ldots, c_k$ of their quotations
$m_1, m_2, \ldots, m_k$, respectively. As the cipher texts reach the
organization through an insecure channel, or the organization may keep the
cipher texts public, the adversary will always be able to get hold of these
cipher texts. The aim of the adversary is to submit cipher texts $c'_i$, each
encrypting an amount that is, say, 10 % less than the original quotation $m_i$,
for all $i = 1, 2, \ldots, k$. We assume that the $m_i$ are all multiples
of 10. This assumption is quite justified, as usually the amounts in quotations
are multiples of 10. Thus to attack the scheme, the adversary will do the
following:

- collect all the cipher texts $c_i$ corresponding to the plain texts $m_i$, $i = 1, 2, \ldots, k$,

- construct $c'_i = c_i \cdot (90)^e \cdot (100)^{-e} \pmod{n}$, $i = 1, 2, \ldots, k$,

- send $c'_i$ to the organization, $i = 1, 2, \ldots, k$.

Now let us see what is actually happening. Note that
$c'_i = c_i \cdot (90)^e \cdot (100)^{-e} = (m_i \cdot 90 \cdot (100)^{-1})^e \pmod{n}$.
As $p$ and $q$ are large primes, $\gcd(100, n) = 1$. Thus the inverse of 100 in
the ring $\mathbb{Z}_n$ exists and $c'_i$ will be simply the cipher text of the
plain text $m'_i = 9 m_i / 10$, which is $m_i$ decreased by 10 %. Thus
$\min\{m'_1, m'_2, \ldots, m'_k\} < \min\{m_1, m_2, \ldots, m_k\}$. As a
result, the adversary will always get the project. A similar attack may also be
mounted on any on-line auction using the RSA encryption scheme, in which the
adversary can always submit a price which is higher than all the actually
submitted prices.
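Three of the attacks above can be reproduced on toy numbers. The following Python sketch is illustrative only: the function names, the small moduli and the messages are assumptions chosen for readability, and the helper `pow(x, -1, n)` (Python 3.8 or later) is used for modular inverses.

```python
def integer_cube_root(c):
    """Largest r with r**3 <= c, found by binary search over the integers."""
    lo, hi = 0, 1
    while hi ** 3 <= c:
        hi *= 2
    while hi - lo > 1:              # invariant: lo**3 <= c < hi**3
        mid = (lo + hi) // 2
        if mid ** 3 <= c:
            lo = mid
        else:
            hi = mid
    return lo


def crt(residues, moduli):
    """Solve x = r_i (mod n_i) for pairwise coprime n_i; result lies in [0, prod n_i)."""
    x, n = 0, 1
    for r, m in zip(residues, moduli):
        t = ((r - x) * pow(n, -1, m)) % m   # lift the solution from mod n to mod n*m
        x, n = x + n * t, n * m
    return x


def broadcast_attack(ciphertexts, moduli):
    """Same m, e = 3, three pairwise coprime moduli: CRT gives m^3 over the integers."""
    return integer_cube_root(crt(ciphertexts, moduli))


def egcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = egcd(b, a % b)
    return g, v, u - (a // b) * v


def common_modulus_attack(c1, c2, e1, e2, n):
    """Same m and n, coprime exponents: m = c1^u * c2^v mod n with u*e1 + v*e2 = 1."""
    g, u, v = egcd(e1, e2)
    if g != 1:
        raise ValueError("exponents must be relatively prime")
    # pow with a negative exponent first inverts the base modulo n
    return (pow(c1, u, n) * pow(c2, v, n)) % n


def underbid(c, e, n):
    """Turn an encryption of m (a multiple of 10) into an encryption of 9*m/10."""
    factor = (90 * pow(100, -1, n)) % n
    return (c * pow(factor, e, n)) % n


moduli = [5 * 11, 17 * 23, 29 * 41]                   # toy pairwise coprime moduli
assert broadcast_attack([pow(42, 3, m) for m in moduli], moduli) == 42

n = 101 * 113                                         # toy modulus shared by Bob1 and Bob2
assert common_modulus_attack(pow(9726, 3, n), pow(9726, 3533, n), 3, 3533, n) == 9726

c_forged = underbid(pow(2000, 3533, n), 3533, n)      # adversary never sees 2000
assert pow(c_forged, 6597, n) == 1800                 # organization decrypts with d = 6597
```

None of the three routines uses a private key: each works only from public keys and intercepted cipher texts.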

In the next section, we are going to describe another public-key cryptosystem 
based on discrete logarithm problem. 


12.7 ElGamal Cryptosystem 

Though the Diffie-Hellman key exchange algorithm provided the first direction
towards the invention of public-key cryptosystems, it did not achieve the full
goal of being a public-key cryptosystem. The first public-key cryptosystem, as
mentioned in the last section, was the RSA system. However, though RSA was
historically the first one, the most natural public-key cryptosystem based on
the discrete logarithm problem and Diffie-Hellman key exchange was developed by
ElGamal (1985). In this section we describe the version of the ElGamal
cryptosystem which is based on the discrete logarithm problem over the finite
field $\mathbb{Z}_p$. Note that a similar construction can also be made more
generally using the DLP in any group.

Table 12.4 Algorithm for ElGamal public key cryptosystem

• Key Generation Algorithm by Bob:

1. Generate a large prime number $p$.

2. Find a primitive element, say $g$, of $\mathbb{Z}_p^*$. Bob may take the help
of some trusted third party in generating the prime $p$ and the primitive element.

3. Choose $a$, $1 < a < p - 1$, and compute $X = g^a \pmod{p}$.

4. Keep $a$ as a secret key and make public the entities $g$, $p$, and $X$ as public keys.

• Encryption Algorithm for Alice:

1. Obtain Bob's public key.

2. Represent the message as an integer $m$ in the interval $[1, p - 1]$.

3. Choose a random ephemeral key $k \in \mathbb{Z}_{p-1}$.

4. Compute $c_1 = g^k \pmod{p}$ and $c_2 = m X^k \pmod{p}$.

5. Send the cipher text $(c_1, c_2)$ to Bob.

• Decryption Algorithm for Bob:

1. To obtain the plain text message $m$, Bob uses his private key $a$ to compute
$m = (c_1^a)^{-1} \cdot c_2 \pmod{p}$.


12.7.1 Algorithm for ElGamal Cryptosystem 

Suppose Alice wants to send a secret message to Bob using the ElGamal public-key
cryptosystem. The steps for the ElGamal algorithm are described in Table 12.4.

Informally, the plain text $m$ is "masked" by multiplying it by $X^k$ to get
$c_2$. The value $c_1 = g^k$ is also transmitted as part of the cipher text.
The receiver Bob, who knows the secret key $a$, is able to compute
$X^k = c_1^a$. Thus Bob can "remove the mask" by computing
$c_2 (c_1^a)^{-1} \pmod{p}$. Thus formally, the ElGamal cryptosystem is a
five-tuple $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$,
where $\mathcal{P} = \mathbb{Z}_p^*$,
$\mathcal{C} = \mathbb{Z}_p^* \times \mathbb{Z}_p^*$,
$\mathcal{K} = \{(p, g, a, X) : X = g^a \pmod{p}\}$ and $p$ is a large prime.
The value $a$ is considered as a secret key and the entities $g$, $p$, and $X$
are considered as the public keys. For each $K = (p, g, a, X) \in \mathcal{K}$
and for a secret random ephemeral key $k \in \mathbb{Z}_{p-1}$, the encryption
function $e_K$ is defined by $e_K(m, k) = (c_1, c_2)$, where
$c_1 = g^k \pmod{p}$ and $c_2 = m X^k \pmod{p}$. For
$c_1, c_2 \in \mathbb{Z}_p^*$, the decryption function is defined as
$d_K(c_1, c_2) = c_2 (c_1^a)^{-1} \pmod{p}$.
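The masking and unmasking can be seen in a short Python sketch. The toy parameters $p = 23$, $g = 5$ and the function names are assumptions made for illustration; the ephemeral key is drawn here from $[1, p - 2]$ (excluding $k = 0$, which would leave the message unmasked), which is a choice of this sketch rather than something fixed by the text.

```python
import secrets


def elgamal_keygen(p, g):
    """Return (a, X): secret a with 1 < a < p-1 and public X = g^a mod p."""
    a = secrets.randbelow(p - 3) + 2
    return a, pow(g, a, p)


def elgamal_encrypt(m, p, g, X):
    """Encrypt m in [1, p-1] as (c1, c2) = (g^k, m * X^k) mod p with fresh random k."""
    k = secrets.randbelow(p - 2) + 1      # ephemeral key, assumed range [1, p-2]
    return pow(g, k, p), (m * pow(X, k, p)) % p


def elgamal_decrypt(c1, c2, p, a):
    """Remove the mask X^k = c1^a and return m."""
    mask = pow(c1, a, p)
    return (c2 * pow(mask, -1, p)) % p    # modular inverse, Python 3.8+


p, g = 23, 5                              # toy parameters; 5 generates Z_23^*
a, X = elgamal_keygen(p, g)
c1, c2 = elgamal_encrypt(17, p, g, X)
assert elgamal_decrypt(c1, c2, p, a) == 17
```

Encrypting the same `m` twice will usually give different pairs `(c1, c2)`, because a fresh `k` is drawn each time.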

Remark 1 The ElGamal cryptosystem will be insecure if the adversary can compute
the value of $a$, i.e., if the adversary can compute $\log_g X$, as in that
case the adversary can decrypt the cipher text just as Bob does. Thus a
necessary condition for the ElGamal cryptosystem to be secure is that the
Discrete Logarithm Problem in $\mathbb{Z}_p^*$ is infeasible.

Remark 2 Due to the use of the random ephemeral key $k \in \mathbb{Z}_{p-1}$,
the encryption scheme becomes randomized. More specifically, there are $p - 1$
possible cipher texts that are encryptions of the same plain text, with the
result that the encryption scheme is poly-alphabetic.

In the next section we introduce another public-key cryptosystem, which is a
variant of the Rabin encryption scheme. The actual Rabin scheme is very similar
to the RSA encryption scheme, but with one crucial difference: it is possible
to prove that inverting Rabin encryption is as hard as factoring, and the
variant presented below can be shown to be CPA-secure under the assumption that
factoring is hard. This equivalence makes the scheme attractive.


12.8 Rabin Cryptosystem 

The Rabin cryptosystem was published in 1979 by Michael O. Rabin. It was the
first asymmetric cryptosystem whose security is based on the fact that it is
easy to compute square roots modulo a composite number $N$ if the factorization
of $N$ is known, yet it appears difficult to compute square roots modulo $N$
when the factorization of $N$ is unknown. In other words, the factoring
assumption implies the difficulty of computing square roots modulo a composite.
However, due to the deterministic (i.e., mono-alphabetic) nature of the scheme,
the actual Rabin encryption scheme is not chosen plaintext attack (CPA) secure.
But a simple variant of the original Rabin encryption provides CPA security. In
this section, we discuss a variant of the original Rabin encryption scheme
which is CPA-secure. For detailed security notions, the reader may refer to the
books (Adhikari et al. 2013; Katz and Lindell 2007).

The preliminaries required to understand the scheme are explained in the next
section.


12.8.1 Square Roots Modulo N 

Finding square roots modulo $N = pq$, where $p$ and $q$ are distinct primes,
plays an important role in constructing the Rabin public-key cryptosystem. The
following proposition characterizes when an element $y$ is a quadratic residue
modulo a composite integer $N$, where $N$ is a product of two distinct odd
primes.

Proposition 12.8.1 Let $N = pq$, where $p$ and $q$ are distinct odd primes, and
$y \in \mathbb{Z}_N^*$. Let $(y_p, y_q)$ denote the corresponding element in
$\mathbb{Z}_p^* \times \mathbb{Z}_q^*$, where $y_p = y \pmod{p}$ and
$y_q = y \pmod{q}$. Then $y$ is a quadratic residue modulo $N$ if and only if
$y_p$ is a quadratic residue modulo $p$ and $y_q$ is a quadratic residue
modulo $q$.

Proof Let $y$ be a quadratic residue modulo $N$. Then there exists
$x \in \mathbb{Z}_N^*$ such that $x^2 \equiv y \pmod{N}$. For
$x \in \mathbb{Z}_N^*$, let $(x_p, x_q)$ be the corresponding element in
$\mathbb{Z}_p^* \times \mathbb{Z}_q^*$, where $x_p = x \pmod{p}$ and
$x_q = x \pmod{q}$. Thus $(x_p, x_q)^2 = (x_p^2, x_q^2)$ is the corresponding
element of $x^2$ in $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$, with the result
that $(x_p^2, x_q^2) = (y_p, y_q)$, which implies that
$x_p^2 \equiv y_p \pmod{p}$ and $x_q^2 \equiv y_q \pmod{q}$. Thus $y_p$ is a
quadratic residue modulo $p$ and $y_q$ is a quadratic residue modulo $q$.

Conversely, suppose $y_p$ and $y_q$ are quadratic residues modulo $p$ and $q$,
respectively. Then there exist $x_p$ and $x_q$ in $\mathbb{Z}_p^*$ and
$\mathbb{Z}_q^*$, respectively, such that $x_p^2 \equiv y_p \pmod{p}$ and
$x_q^2 \equiv y_q \pmod{q}$. By the Chinese Remainder Theorem, there exists
$x \in \mathbb{Z}_N^*$ corresponding to $(x_p, x_q)$, and arguing as above,
$x^2 \equiv y \pmod{N}$, with the result that $y$ is a quadratic residue
modulo $N$. □

The security of the Rabin cryptosystem is based on the fact that it is easy to
compute square roots modulo an odd composite number $N = pq$ if the
factorization of $N$ is known, where $p$ and $q$ are distinct odd primes such
that $p \equiv q \equiv 3 \pmod{4}$. However, no efficient method is yet known
to compute square roots modulo $N$ if the factorization of $N$ is not known. In
fact, we shall show that computing square roots modulo $N$ is equivalent to (in
the sense that it is as hard as) factoring $N$. To this end, let us prove the
following proposition, which shows that if we can compute square roots modulo
$N = pq$, where $p$ and $q$ are distinct odd primes, then we can find an
efficient algorithm for factoring.

Proposition 12.8.2 Let $N = pq$, where $p$ and $q$ are distinct odd primes. Let
$a_1$ and $a_2$ be such that $a_1^2 \equiv a_2^2 \equiv b \pmod{N}$, but
$a_1 \not\equiv \pm a_2 \pmod{N}$. Then there exists an efficient algorithm to
factor $N$.

Proof As $a_1^2 \equiv a_2^2 \pmod{N}$, $N$ divides $(a_1 + a_2)(a_1 - a_2)$.
Thus $p$ divides $(a_1 + a_2)(a_1 - a_2)$ and hence $p$ divides one of
$(a_1 + a_2)$ or $(a_1 - a_2)$. Let $p$ divide $a_1 + a_2$. The proof is
similar if $p$ divides $a_1 - a_2$. We claim that $q$ does not divide
$a_1 + a_2$, as otherwise $N$ will divide $a_1 + a_2$, contradicting the fact
that $a_1 \not\equiv \pm a_2 \pmod{N}$. Thus $q$ does not divide $a_1 + a_2$
and hence $\gcd(N, a_1 + a_2) = p$. As the Euclidean algorithm to compute the
gcd of two integers is an efficient one, we can find $p$ efficiently, and
thereby factor $N$ efficiently. □
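The proof is constructive, and the resulting procedure is a single gcd computation. The Python sketch below illustrates it on toy numbers; the function name and the examples are assumptions made for illustration. Note that $\gcd(N, a_1 + a_2)$ may turn out to be either of the two primes, depending on which one divides $a_1 + a_2$.

```python
from math import gcd


def factor_from_square_roots(a1, a2, N):
    """Given a1^2 = a2^2 (mod N) with a1 != +-a2 (mod N), return a factorization of N."""
    if (a1 * a1 - a2 * a2) % N != 0:
        raise ValueError("a1 and a2 are not square roots of the same value modulo N")
    if (a1 - a2) % N == 0 or (a1 + a2) % N == 0:
        raise ValueError("a1 = +-a2 (mod N) reveals nothing about the factors")
    f = gcd(N, a1 + a2)      # a non-trivial factor by Proposition 12.8.2
    return f, N // f


# Toy example: 4^2 = 16 and 10^2 = 100 = 16 (mod 21), while 4 != +-10 (mod 21).
assert factor_from_square_roots(4, 10, 21) == (7, 3)
```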

Proposition 12.8.3 Let $N = pq$, where $p$ and $q$ are distinct odd primes of
the form $p \equiv q \equiv 3 \pmod{4}$. Then every quadratic residue modulo
$N$ has exactly one square root which is also a quadratic residue modulo $N$.

Proof As $p \equiv q \equiv 3 \pmod{4}$,
$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} = -1$ and
$\left(\frac{-1}{q}\right) = (-1)^{\frac{q-1}{2}} = -1$, where
$\left(\frac{-1}{p}\right)$ and $\left(\frac{-1}{q}\right)$ denote the Legendre
symbols of $-1$ modulo $p$ and $q$, respectively. Let $y$ be an arbitrary
quadratic residue modulo $N$. As $N$ is a product of two distinct primes, $y$
will have four distinct square roots modulo $N$, namely $(x_p, x_q)$,
$(-x_p, x_q)$, $(x_p, -x_q)$ and $(-x_p, -x_q)$. We shall show that out of
these four square roots modulo $N$, exactly one is a quadratic residue modulo
$N$. Let $\left(\frac{x_p}{p}\right) = 1$ and
$\left(\frac{x_q}{q}\right) = -1$. Similar arguments are applicable for the
other cases. Then
$\left(\frac{-x_q}{q}\right) = \left(\frac{-1}{q}\right)\left(\frac{x_q}{q}\right) = (-1)(-1) = 1$,
with the result that $-x_q$ is a quadratic residue modulo $q$. Thus by
Proposition 12.8.1, $(x_p, -x_q)$ is a quadratic residue modulo $N$. By a
similar argument it can be shown that none of $(x_p, x_q)$, $(-x_p, x_q)$ and
$(-x_p, -x_q)$ can be a quadratic residue modulo $N$. □

Table 12.5 A variant of Rabin public key cryptosystem

• Key Generation Algorithm by Bob:

1. Choose two $k$-bit primes $p$ and $q$ such that $p \equiv q \equiv 3 \pmod{4}$.

2. Compute $N = pq$.

3. Publish $N$ as the public key; $p$ and $q$ are kept secret as private keys.

• Encryption Algorithm for Alice:

1. Collect the public key $N$.

2. Let $m \in \{0, 1\}$ be the plain text.

3. Choose $x \in_R \mathcal{QR}_N$ and compute $C = (c, c')$, where
$c = x^2 \pmod{N}$ and $c' = \mathrm{lsb}(x) \oplus m$. Here $\mathcal{QR}_N$
denotes the set of quadratic residues modulo $N$ and lsb denotes the least
significant bit of the binary representation of $x$.

4. Send $C$ to Bob.

• Decryption Algorithm for Bob:

1. Collect $C = (c, c')$ sent by Alice.

2. Compute the unique $x \in \mathcal{QR}_N$ such that $x^2 \equiv c \pmod{N}$.
[This is possible due to Proposition 12.8.3]

3. Compute $m = \mathrm{lsb}(x) \oplus c'$.

Based on the above propositions, we present in Table 12.5 a variant of the
Rabin cryptosystem which can be shown to be chosen plain text attack (CPA)
secure based solely on the assumption that factorization is hard. As the
security notions are beyond the scope of this book, we do not discuss the proof
of the CPA security.


12.8.2 Algorithm for a variant of Rabin Cryptosystem 

Suppose Alice wants to send a message to Bob. A variant of the Rabin cryptosystem
for this purpose is explained in Table 12.5.

In the Rabin encryption scheme, the receiver is required to compute modular
square roots. So we first find an algorithm for computing square roots modulo a
prime $p \equiv 3 \pmod{4}$ and then extend it to compute square roots modulo a
composite $N = pq$ with known factorization, where $p \equiv q \equiv 3 \pmod{4}$.
Let us start with the case of finding square roots modulo a prime
$p \equiv 3 \pmod{4}$.

Let $p = 4k + 3$, $k = 0, 1, 2, \ldots$. Then $\frac{p+1}{4} = k + 1$ is an integer.
Let $a$ be a quadratic residue modulo $p$. Then $\left(\frac{a}{p}\right) = 1$. This
implies that $a^{\frac{p-1}{2}} \equiv 1 \pmod{p}$. Multiplying
Table 12.6 Algorithm to compute four square roots of $a \in \mathbb{Z}_N^*$ modulo
$N = pq$ having the knowledge of factorization of $N$, where
$p \equiv q \equiv 3 \pmod{4}$

• Input: Two primes $p$ and $q$ such that $N = pq$ and a quadratic residue
$a \in \mathbb{Z}_N^*$, where $p \equiv q \equiv 3 \pmod{4}$.

• Method:

1. Compute $a_p = a \pmod{p}$ and $a_q = a \pmod{q}$.

2. Using Lemma 12.8.1, compute the two square roots $x_p$ and $-x_p$ of $a_p$ modulo
$p$. Similarly compute $x_q$ and $-x_q$ of $a_q$ modulo $q$.

3. Using the Chinese Remainder Theorem, compute the four elements $x_1$, $x_2$, $x_3$
and $x_4$ in $\mathbb{Z}_N^*$ corresponding to the elements $(x_p, x_q)$, $(-x_p, x_q)$,
$(x_p, -x_q)$ and $(-x_p, -x_q)$, respectively, in $\mathbb{Z}_p^* \times \mathbb{Z}_q^*$.

• Output: Four square roots of $a$ modulo $N$.


both sides by $a$, we get $a \equiv a^{\frac{p-1}{2}+1} = a^{2k+2} = (a^{k+1})^2 \pmod{p}$.
Thus $a^{k+1} = a^{\frac{p+1}{4}} \pmod{p}$ is a square root of $a$ modulo $p$, the
other square root being $-a^{k+1} \pmod{p}$. This leads to the following lemma.

Lemma 12.8.1 Let $a$ be a quadratic residue modulo a prime $p$ of the form
$4k + 3$, $k = 0, 1, 2, \ldots$. Then $a^{k+1} \pmod{p}$ and $-a^{k+1} \pmod{p}$ are
the two square roots of $a$ modulo $p$.

Proof As $p$ is a prime, $a$ has exactly two distinct square roots modulo $p$, and if
$x$ is a square root modulo $p$, then $-x$ is also a square root modulo $p$. □
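
As an illustration of Lemma 12.8.1, the computation may be sketched in Python as
follows. This is only a minimal sketch: the function name sqrt_mod_p and the toy
values are our own choices and not part of the text, p is assumed to be prime,
and Euler's criterion is used only to reject inputs that are not quadratic
residues.

```python
def sqrt_mod_p(a: int, p: int) -> tuple[int, int]:
    """Square roots of a quadratic residue a modulo a prime p = 4k + 3.

    Illustrative helper (the name is hypothetical); p is assumed to be prime.
    """
    if p % 4 != 3:
        raise ValueError("Lemma 12.8.1 needs p = 3 (mod 4)")
    a %= p
    # Euler's criterion: a non-zero a is a quadratic residue modulo p
    # exactly when a^((p-1)/2) = 1 (mod p).
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a quadratic residue modulo p")
    x = pow(a, (p + 1) // 4, p)  # a^(k+1) mod p, since (p + 1)/4 = k + 1
    return x, p - x


# Toy check with p = 7 = 4*1 + 3 and a = 2 (note that 3^2 = 9 = 2 mod 7).
roots = sqrt_mod_p(2, 7)
```

Squaring either entry of roots modulo 7 gives back 2, as the lemma predicts.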

Using Lemma 12.8.1 and the Chinese Remainder Theorem, we can describe the
algorithm as in Table 12.6 to compute square roots of $a$ modulo $N = pq$, provided
the factorization of $N$ is known, where $p$ and $q$ are primes with
$p \equiv q \equiv 3 \pmod{4}$.
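
The steps of Table 12.6, and their use in the one-bit scheme of Table 12.5, may
be sketched in Python as follows. This is a toy sketch under our own
assumptions: all function names are hypothetical, the primes 7 and 11 are far
too small for real use, a random element of the set of quadratic residues is
obtained by squaring a random unit, and Python 3.8 or later is assumed for
modular inverses via pow.

```python
import math
import secrets


def crt(xp, p, xq, q):
    """Combine x = xp (mod p) and x = xq (mod q) into x mod pq (p, q distinct primes)."""
    # pow(q, -1, p) is the inverse of q modulo p.
    return (xp * q * pow(q, -1, p) + xq * p * pow(p, -1, q)) % (p * q)


def four_square_roots(a, p, q):
    """Table 12.6: four square roots of a quadratic residue a modulo N = pq."""
    xp = pow(a % p, (p + 1) // 4, p)  # Lemma 12.8.1 modulo p
    xq = pow(a % q, (q + 1) // 4, q)  # Lemma 12.8.1 modulo q
    return [crt(sp, p, sq, q) for sp in (xp, p - xp) for sq in (xq, q - xq)]


def is_qr(x, p, q):
    """x is a quadratic residue modulo N = pq iff it is one modulo p and modulo q."""
    return pow(x, (p - 1) // 2, p) == 1 and pow(x, (q - 1) // 2, q) == 1


def encrypt_bit(m, N):
    """Table 12.5, encryption of one bit m under the public key N."""
    while True:
        r = secrets.randbelow(N)
        if math.gcd(r, N) == 1:
            break
    x = r * r % N                       # a random quadratic residue modulo N
    return x * x % N, (x & 1) ^ m       # (c, c') with c' = lsb(x) XOR m


def decrypt_bit(C, p, q):
    """Table 12.5, decryption using the private primes p and q."""
    c, c_prime = C
    # Exactly one of the four square roots of c is itself a quadratic residue.
    x = next(r for r in four_square_roots(c, p, q) if is_qr(r, p, q))
    return (x & 1) ^ c_prime


roots_of_4 = sorted(four_square_roots(4, 7, 11))
```

For example, modulo 77 the four square roots of 4 are 2, 9, 68 and 75, and only
9 is itself a quadratic residue modulo 77.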


12.9 Digital Signature 

In general, a “conventional” handwritten signature of a person attached to a
document is used to specify that the person is responsible for the document. In our
daily life, we use signatures while writing a letter, withdrawing money from a bank
by cheque, signing a contract, etc. We have all heard about signatures, but what
then is a “digital signature”? Before we come to the concept of a digital signature,
let us first explore an example from real life to review the concept of a signature.
Suppose Alice wants to sell her house for $125,000 and Bob wants to buy the house.
Bob receives a signed letter from Alice mentioning that the price of the house is
$125,000. When Bob reads the signed letter, he knows that this letter is indeed from
Alice, as long as nobody else has forged Alice’s signature. Now the question is, how
can Bob be absolutely sure that the signed letter comes from Alice and is not a
forgery? It may happen that somebody else, say an adversary, sends the letter to Bob
in the name of Alice by copying her signature while Alice actually wanted to sell
her house for $225,000. It may further happen that Alice changes her mind and
decides to sell her house for $325,000, and when Bob comes with the letter from
Alice, she may claim that it must be a forgery: she never signed such a letter.

These lead to the following important features of signature: 

• Authenticity: It convinces Bob that the signed letter indeed comes from Alice.

• Unforgeability: Nobody else but Alice could have signed the message.

• Non-reusability: The same signature cannot be attached to another message.

• Unalterability: Once a message has been signed, it cannot be altered, i.e., if
Alice signs a message m and then claims the same signature s for a different
message M, then there should be some mechanism through which Bob can check the
correctness. As a result, no one is able to use the same signature s for different
messages.

• Non-repudiation: After signing a message, Alice cannot deny that she signed the
message.

Now we are going to define a digital signature scheme. Public-key cryptosystems
play an important role in constructing and defining digital signatures. Informally,
any signature scheme consists of two components: a signing algorithm and a
verification algorithm. Suppose Alice wants to send a digitally signed message to
Bob. Alice can sign a digital message $m$ by using a signing algorithm $sig_{sk}$,
which is based on a secret key $sk$ known only to Alice, to produce a digital
signature $sig_{sk}(m)$ for the message $m$. The resulting signature $sig_{sk}(m)$ can
subsequently be verified by Bob using a publicly known verification algorithm
$ver_{pk}$ based on the public key $pk$. Thus given a message $m$ and a purported
signature $s$ on $m$, the verification algorithm returns “true” if $s$ is a valid
signature on $m$, else it returns “false”. The formal definition of a digital
signature scheme is as follows.


Definition 12.9.1 A digital signature scheme is a five-tuple
$(\mathcal{P}, \mathcal{A}, \mathcal{K}, \mathcal{S}, \mathcal{V})$, where the
following conditions are satisfied:

• $\mathcal{P}$ is a finite set of possible messages.

• $\mathcal{A}$ is a finite set of possible signatures.

• $\mathcal{K}$ is a finite set of possible keys. $\mathcal{K}$ is known as the keyspace.

• For each key $K = (pk, sk) \in \mathcal{K}$, there is a signing algorithm
$sig_{sk} \in \mathcal{S}$ based on the private key $sk$ and a corresponding
verification algorithm $ver_{pk} \in \mathcal{V}$ based on the public key $pk$. Each
$sig_{sk} : \mathcal{P} \to \mathcal{A}$ and
$ver_{pk} : \mathcal{P} \times \mathcal{A} \to \{\text{true}, \text{false}\}$ are
functions such that for all $(m, s) \in \mathcal{P} \times \mathcal{A}$, the
following condition is satisfied:

$$ver_{pk}(m, s) = \begin{cases} \text{true} & \text{if } s = sig_{sk}(m) \\ \text{false} & \text{if } s \neq sig_{sk}(m). \end{cases}$$



The pair (m, s) is called a digitally signed message. 
Remark Let us now discuss some of the differences between the conventional
pen-and-ink signature and the digital signature. A traditional pen-and-ink
signature is an integral part of the document, while a digital signature is not
attached physically to the message that is signed. Hence some algorithm is required
to “bind” the digital signature to the message. Another difference is from the
viewpoint of verification. The validity of a traditional signature is verified by
comparing it with other stored authentic signatures. Clearly, this method is not
secure as someone may forge someone else’s signature. On the other hand, digital
signatures can be verified by anyone using the publicly known verification
algorithm. One of the major characteristics of a digital signature is that a digital
copy of a signed digital message is identical to the original, while a photocopy of
a pen-and-ink signature on a document can usually be distinguished from the original
document. As a result, care must be taken so that the same digital signature on a
document cannot be used more than once. For example, suppose Alice digitally signs
a cheque to issue $2000 to Bob. Alice must then take some measure while signing,
e.g., include a date or some other distinguishing information in the message, so
that Bob is not able to reuse the same digitally signed cheque.

Now let us explore a digital signature scheme which is based on RSA public-key 
cryptosystem. 


12.9.1 RSA Based Digital Signature Scheme 

In Sect. 12.6, we have already explored the celebrated RSA algorithm for
public-key cryptography. The RSA scheme may also be used to produce a very
simple signature scheme, known as the RSA-based digital signature scheme, which is
almost the reverse of the RSA public-key cryptosystem. Let us explain the
algorithm.

Suppose Alice wants to digitally sign a message $m$ and send it to Bob.
The algorithm is given in Table 12.7.

Remark 1 Anyone can forge the RSA digital signature of Alice by choosing a random
$s \in \mathbb{Z}_n$ and then computing $m = s^e \pmod{n}$. Then clearly,
$s = sig_d(m)$ is a valid signature on the random message $m$. However, no efficient
method is known so far by which one can first choose a meaningful message $m$ and
then compute the corresponding signature $s$. In fact, it can be shown that if this
could be done, then the RSA cryptosystem is insecure. Further, due to the
homomorphic nature of the RSA cryptosystem, it is vulnerable to existential forgery,
in which it is possible to generate a legitimate message and signature pair without
knowing the private key $d$. For example, if $(m_1, s_1)$ and $(m_2, s_2)$ are two
signed messages, then $(m_1 \cdot m_2, s_1 \cdot s_2)$, with both products taken
modulo $n$, is also a legitimate signed message.
Table 12.7 Algorithm for RSA based signature scheme

• Key Generation Algorithm by Alice:

1. Generate two distinct large prime numbers $p$ and $q$, each roughly of the same size.

2. Compute $n = pq$ and $\phi(n) = (p - 1)(q - 1)$.

3. Select a random integer $e$ with $1 < e < \phi(n)$, such that $\gcd(e, \phi(n)) = 1$.

4. Use the extended Euclidean algorithm to find the integer $d$, $1 < d < \phi(n)$, such that
$ed \equiv 1 \pmod{\phi(n)}$.

5. Make public the values of $n$ and $e$ (which are known to everybody and hence considered
as public keys) and keep secret the private keys $p$, $q$, and $d$ (which are known only to
Alice).

• Signing Algorithm for Alice:

1. Represent the message as an integer $m$ in the interval $[0, n - 1]$.

2. Compute the signature $sig_d(m) = s = m^d \pmod{n}$.

3. Send the pair $(m, s)$ to Bob.

• The Verification Algorithm for Bob:

1. Obtain the pair $(m, s)$ and the public keys $n$ and $e$ from Alice.

2. Compute $ver_{e,n}(m, s)$. If $m \equiv s^e \pmod{n}$, i.e., if $ver_{e,n}(m, s)$ = true, Bob
accepts the signature, else rejects. This works because $s^e \equiv m^{ed} \equiv m \pmod{n}$.
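
The scheme of Table 12.7 and the forgeries mentioned in Remark 1 may be sketched
in Python as follows. This is only a toy sketch under our own assumptions: the
function names are hypothetical, the primes 61 and 53 and the exponent 17 are
chosen by us and are far too small for real use, and Python 3.8 or later is
assumed for the modular inverse.

```python
import math


def keygen(p: int, q: int, e: int):
    """Return the public key (n, e) and the private exponent d."""
    n, phi = p * q, (p - 1) * (q - 1)
    if math.gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    return (n, e), pow(e, -1, phi)  # d with e*d = 1 (mod phi(n))


def sign(m: int, d: int, n: int) -> int:
    """Signing algorithm: s = m^d mod n."""
    return pow(m, d, n)


def verify(m: int, s: int, e: int, n: int) -> bool:
    """Verification algorithm: accept iff m = s^e (mod n)."""
    return pow(s, e, n) == m % n


(n, e), d = keygen(61, 53, 17)   # toy parameters: n = 3233
m1, m2 = 65, 1000
s1, s2 = sign(m1, d, n), sign(m2, d, n)

# Forgery of a random message: pick s first, then set m = s^e mod n.
s_rand = 1234
m_rand = pow(s_rand, e, n)

# Existential forgery from two signed messages, using only public values.
m3, s3 = (m1 * m2) % n, (s1 * s2) % n
```

Here verify accepts (m1, s1), (m_rand, s_rand) and (m3, s3), although the last
two pairs were produced without using d.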


Remark 2 There has been a rich literature on various security notions of different 
types of signature schemes which are out of scope for this book. The reader may read 
the book (Katz 2010) for detailed and systematic analysis of signature schemes. 

In the next section, we are going to discuss a very important primitive
known as oblivious transfer that plays an important role in modern cryptography.


12.10 Oblivious Transfer 

Suppose Alice and Bob are very shy. However, they would like to know whether they
are interested in each other or not. One simple way could be to approach each other
directly. If both of them show interest in each other, there is no problem. But if
one of them refuses, it could be embarrassing for the other. So they need a protocol
in which they are able to figure out whether they both agree, but in such a way that
if they do not, then whoever has said no gets no clue about the other’s opinion.
Moreover, they may want to be able to carry out this protocol publicly over a
distance. This problem is known as the dating or matching problem. At this point, it
looks quite complicated. Alice and Bob may choose some trusted third party to whom
both of them confide their decisions; the trusted third party then declares whether
both of them agree or not, keeping the individual decisions secret for ever. But
finding such a trusted third party is very difficult. For this kind of situation, an
important primitive of cryptography, known as oblivious transfer, could be a useful
tool.
Informally, an oblivious transfer is a protocol between two players, a sender
$\mathcal{S}$ and a receiver $\mathcal{R}$, such that it achieves the following: the
sender $\mathcal{S}$ has two secret input bits $b_0$ and $b_1$ while the receiver
$\mathcal{R}$ has a secret selection bit $s$. The protocol consists of a number of
exchanges of information between the sender $\mathcal{S}$ and the receiver
$\mathcal{R}$. At the end of the protocol, the receiver $\mathcal{R}$ obtains the bit
$b_s$ without obtaining any information about the other bit $b_{1-s}$. This is known
as the sender’s security. On the other hand, the sender $\mathcal{S}$ does not get
any information about the selection bit $s$ chosen by the receiver $\mathcal{R}$.
This is known as the receiver’s security.

Let the output of an oblivious transfer protocol be denoted by
$OT(b_0, b_1, s) = b_s$ for the inputs of the secret bits $b_0$ and $b_1$ of the sender
$\mathcal{S}$ and the secret selection bit $s$ of the receiver $\mathcal{R}$. Observe
that $b_s$ is actually equal to $(1 \oplus s) \cdot b_0 \oplus s \cdot b_1$, where
“$\oplus$” denotes the addition of two bits modulo 2 and “$\cdot$” denotes the binary
“AND” of two bits. Note that if $s = 0$, then
$OT(b_0, b_1, 0) = (1 \oplus 0) \cdot b_0 \oplus 0 \cdot b_1 = b_0$ and if $s = 1$, then
$OT(b_0, b_1, 1) = (1 \oplus 1) \cdot b_0 \oplus 1 \cdot b_1 = b_1$.
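
The identity above can be checked exhaustively. The following short Python
sketch (the function name ot is our own) evaluates the expression on all eight
inputs; it models only the input-output behaviour of an oblivious transfer and
says nothing about how the security requirements are met.

```python
def ot(b0: int, b1: int, s: int) -> int:
    """Input-output behaviour of OT: (1 XOR s) AND b0, XOR-ed with s AND b1."""
    return ((1 ^ s) & b0) ^ (s & b1)


# Truth table over all inputs; each value should equal the selected bit b_s.
table = {(b0, b1, s): ot(b0, b1, s)
         for b0 in (0, 1) for b1 in (0, 1) for s in (0, 1)}
```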

Now we again come to the problem of dating as discussed at the beginning of
this section. To apply the oblivious transfer, let us first model this problem
mathematically. Suppose there are two players, say a Sender $\mathcal{S}$ and a
Receiver $\mathcal{R}$. In the context of the dating problem, we may assume Alice to
be the Sender and Bob to be the Receiver. As the secret decision of each of these
players is either yes or no, we may assume that each of them has a secret bit, where
1 stands for “yes” and 0 stands for “no”. Suppose the Sender $\mathcal{S}$ has the
secret bit $a$ and the Receiver $\mathcal{R}$ has the secret bit $b$. They actually
want to compute $a \cdot b$, i.e., the logical “AND” of the bits $a$ and $b$, such
that the following conditions are satisfied:

• Neither the sender nor the receiver is led to accept a false result (this property
is known as correctness).

• Both the sender and the receiver learn the result $a \cdot b$ (this property is
known as fairness).

• Each learns nothing more than what is implied by the result and his or her own
input (this property is known as privacy). For example, at the end of the oblivious
transfer protocol for the dating problem, both the Sender and the Receiver learn
$a \cdot b$. So for the Sender (Alice), if $a = 1$ and she comes to know that
$a \cdot b = 1$, then she learns that the secret bit $b$ of the Receiver (Bob) must
be 1, while if the secret bit $a$ of the Sender (Alice) is 0, then $a \cdot b$ is
always 0, irrespective of the input bit $b$ of the Receiver (Bob). Therefore, in that
case, the Receiver’s (Bob’s) choice of $b$ remains unknown to the Sender. A similar
argument is also applicable when $b = 0$.

Now we are in a position to solve the dating problem by using an Oblivious
Transfer protocol. Note that we have not yet said anything about the existence of
such an Oblivious Transfer protocol. In the next section, we are going to provide an
Oblivious Transfer protocol. Before that, assuming the existence of such an Oblivious
Transfer protocol, we provide the following protocol that solves the dating problem:
Consider Alice to be the Sender and Bob to be the Receiver of an Oblivious Transfer
protocol in which the inputs of Alice are the secret bits $b_0 = 0$ and $b_1 = a$,
while the secret selection bit of Bob is $s = b$. Then we have
$OT(0, a, b) = (1 \oplus b) \cdot 0 \oplus b \cdot a = b \cdot a = a \cdot b$. As the
output of the Oblivious Transfer protocol, Bob receives $a \cdot b$. Finally, Bob
sends $a \cdot b$ to Alice so that both of them learn the result $a \cdot b$.
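
The reduction of the dating problem to oblivious transfer may be sketched in
Python as follows. The sketch treats the oblivious transfer as an ideal black
box, so it illustrates only the flow of values and not the security of any
concrete protocol; the function names ideal_ot and dating are our own.

```python
def ideal_ot(b0: int, b1: int, s: int) -> int:
    """Stand-in for an ideal oblivious transfer: the receiver obtains b_s."""
    return b1 if s else b0


def dating(a: int, b: int):
    """Alice (Sender, bit a) and Bob (Receiver, bit b) compute a AND b."""
    bob_output = ideal_ot(0, a, b)  # Alice inputs b0 = 0, b1 = a; Bob selects s = b
    alice_output = bob_output       # a semi-honest Bob forwards the result to Alice
    return alice_output, bob_output


results = {(a, b): dating(a, b) for a in (0, 1) for b in (0, 1)}
```

Both players obtain the AND of a and b in every case, and when a = 0 the value
returned to Alice is 0 whatever b is.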

Now let us see what conditions are required to make the protocol secure. Note
that to achieve correctness and fairness, we have to assume that Bob indeed sends
the correct value $a \cdot b$ to Alice. So we have to assume that both Alice and Bob
follow the rules of the game but may try to learn as much as possible about the
input of the other player. We call such players semi-honest players. Further, note
that if $b = 0$, then $a \cdot b$ is always 0, no matter what the value of the secret
input $a$ of Alice is. Thus, by the properties of the Oblivious Transfer protocol,
Bob learns nothing more about $a$. On the other hand, if $a = 0$, then from the fact
that the Oblivious Transfer protocol does not leak information about $b$ and the fact
that in this case Bob returns 0 to Alice in the final step no matter what $b$ is,
Alice learns nothing more about $b$ as a result of the complete protocol.

Assuming the existence of an oblivious transfer protocol, we have solved the 
dating problem. Now we are going to provide an oblivious transfer protocol based 
on RSA cryptosystem, as described in Sect. 12.6. 


12.10.1 OT Based on RSA 

To construct the RSA based oblivious transfer protocol, we assume that both the
players, i.e., the Sender and the Receiver, are semi-honest. It has been proved by
Alexi et al. (1988) that in the case of the RSA cryptosystem, if a plain text $m$ is
chosen at random, then given just the cipher text $c = m^e \pmod{n}$ along with the
values of $n$ and $e$, guessing the least significant bit of $m$, i.e.,
$\mathrm{lsb}(m)$, significantly better than at random is as hard as finding all
bits of $m$. Such a bit is called a “hard-core” bit for the RSA encryption function.
The hardness related to the hard-core bit is exploited to obtain an oblivious
transfer by masking the two secret bits $b_0$ and $b_1$ of the Sender. Here the Sender
has two secret bits $b_0$ and $b_1$ while the Receiver has one selection bit $s$. The
Sender first carries out the key setup steps for RSA encryption by choosing two
distinct primes $p$ and $q$ and computing $n = p \cdot q$. For the public key, the
Sender chooses a random $e$ such that $\gcd(e, \phi(n)) = 1$, where $\phi$ denotes
the Euler phi function. Then for decryption, the Sender finds $d$ such that
$ed \equiv 1 \pmod{\phi(n)}$. Finally, the Sender makes public the values of $e$ and
$n$, and keeps secret the values of $d$, $p$, and $q$. After the key setup step, the
Sender sends the public values to the Receiver. The Receiver, having selection bit
$s$, chooses a random plain text $m_s$ modulo $n$ and computes the cipher text
$c_s = m_s^e \pmod{n}$. The Receiver further selects a random integer $c_{1-s}$ modulo
$n$ and considers it as another cipher text. Note that the Receiver can easily find
$\mathrm{lsb}(m_s)$, while due to the hardness of the hard-core bit for the RSA
encryption function, determining $\mathrm{lsb}(m_{1-s})$ from $c_{1-s}$ is infeasible.
The Receiver sends the cipher text pair $(c_0, c_1)$ to the Sender. As the Sender has
the secret key for decryption, she can decrypt the cipher text pair $(c_0, c_1)$ to get
Table 12.8 Algorithm for OT based on RSA

Algorithm for Sender:

• Choose two distinct large primes $p$ and $q$.

• Compute $n = pq$.

• Select a random integer $e$, $1 < e < \phi(n)$, such that $\gcd(e, \phi(n)) = 1$.

• Find the integer $d$, $1 < d < \phi(n)$, such that $ed \equiv 1 \pmod{\phi(n)}$.

• Send the public values $n$ and $e$ to the Receiver.

• Keep secret the values $p$, $q$, and $d$.

• The Sender has two secret bits $b_0$ and $b_1$.

Algorithm for Receiver:

• Collect the public values $n$ and $e$.

• The Receiver has a secret selection bit $s$.

• Choose a random plain text $m_s \in \mathbb{Z}_n$ and compute the cipher text $c_s = m_s^e \pmod{n}$.

• Select $c_{1-s}$ randomly from $\mathbb{Z}_n$ as another cipher text.

• Send the ordered pair $(c_0, c_1)$ to the Sender.

Algorithm for Sender:

• Decrypt $(c_0, c_1)$ to get $(m_0, m_1)$.

• Compute $r_0 = \mathrm{lsb}(m_0)$ and $r_1 = \mathrm{lsb}(m_1)$.

• Mask the bits $b_0$ and $b_1$ by computing $b'_0 = b_0 \oplus r_0$ and $b'_1 = b_1 \oplus r_1$ and send
$(b'_0, b'_1)$ to the Receiver.

Algorithm for Receiver:

• Recover $b_s$ by computing $b'_s \oplus r_s$, where $r_s = \mathrm{lsb}(m_s)$.

• The bit $b_{1-s}$ remains concealed since the Receiver cannot guess $r_{1-s}$ with high enough
probability.

the plain text pair $(m_0, m_1)$. Let $r_0 = \mathrm{lsb}(m_0)$ and
$r_1 = \mathrm{lsb}(m_1)$. Then the Sender masks the bits $b_0$ and $b_1$ by computing
$b'_0 = b_0 \oplus r_0$ and $b'_1 = b_1 \oplus r_1$ and sends $(b'_0, b'_1)$ to the
Receiver. The Receiver recovers $b_s$ by computing $b'_s \oplus r_s$. Note that the bit
$b_{1-s}$ remains concealed since the Receiver cannot guess $r_{1-s}$ with high enough
probability. Note that the selection bit $s$ is unconditionally hidden from the
Sender and security of the Sender is guaranteed due to the assumption that the
Receiver is semi-honest.

The actual algorithm is given in Table 12.8. 
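
The message flow of Table 12.8 may be sketched in Python as follows. This is a
toy sketch under our own assumptions: all function names are hypothetical, the
primes 61 and 53 and the exponent 17 are far too small to give any security,
and both players are simulated honestly inside one program. Python 3.8 or later
is assumed for the modular inverse.

```python
import math
import secrets


def lsb(x: int) -> int:
    """Least significant bit of x."""
    return x & 1


def sender_setup(p: int, q: int, e: int):
    """Sender's RSA key setup; returns the public pair (n, e) and the secret d."""
    n, phi = p * q, (p - 1) * (q - 1)
    if math.gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    return (n, e), pow(e, -1, phi)


def receiver_query(s: int, n: int, e: int):
    """Receiver: encrypt a random m_s in slot s, place a random value in slot 1 - s."""
    m_s = secrets.randbelow(n)
    c = [0, 0]
    c[s] = pow(m_s, e, n)
    c[1 - s] = secrets.randbelow(n)
    return m_s, (c[0], c[1])


def sender_reply(b0: int, b1: int, c, d: int, n: int):
    """Sender: decrypt both slots and mask b0, b1 with the least significant bits."""
    r0, r1 = lsb(pow(c[0], d, n)), lsb(pow(c[1], d, n))
    return b0 ^ r0, b1 ^ r1


def receiver_output(s: int, m_s: int, masked) -> int:
    """Receiver: unmask the selected bit using r_s = lsb(m_s)."""
    return masked[s] ^ lsb(m_s)


def run_ot(b0: int, b1: int, s: int) -> int:
    """Run one complete exchange and return the Receiver's output."""
    (n, e), d = sender_setup(61, 53, 17)  # toy parameters only
    m_s, c = receiver_query(s, n, e)
    return receiver_output(s, m_s, sender_reply(b0, b1, c, d, n))
```

In every run the Receiver's output equals the selected bit, because decrypting
the slot with index s returns the Receiver's own plain text.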

In the next section we are going to introduce a very important concept of cryp- 
tography, known as secret sharing. 


12.11 Secret Sharing 

Due to the recent development of computers and computer networks, huge amounts
of digital data can easily be transmitted or stored. But data transmitted over
networks or stored in computers may easily be destroyed or substituted by enemies
if the data are not enciphered by some cryptographic tools. So it is very important
to restrict access to confidential information stored in a computer or in certain
nodes of a system. Access should be gained through a secret key, password or token.
Again, storing the secret key or password securely could be a problem. The best
solution could be to memorize the secret key. But a large and complicated secret key
is almost impossible to memorize. As a result, it should be stored safely. When data
are stored on a hard disk, threats such as storage device failures or destructive
attacks make the situation even worse. To guard against such threats, we may make as
many copies of the secret data as possible. But if we have many copies of the secret
data, the secret may be leaked out, and hence the number of copies should be as
small as possible. Under these circumstances, it is desirable that the secret key be
governed by a secure key management scheme. If the key or the secret data is shared
among several participants in such a way that the secret data can only be
reconstructed by a sufficiently large and responsible group acting in agreement,
then a high degree of security is attained.

Shamir (1979) and Blakley (1979) independently addressed this problem when they
introduced the concept of a threshold secret sharing scheme. A $(t, n)$ threshold
scheme is a method whereby $n$ pieces of information, called shares, corresponding
to the secret data or key $K$, are distributed to $n$ participants so that the secret
key can be reconstructed from the knowledge of any $t$ or more shares and the secret
key cannot be reconstructed from the knowledge of fewer than $t$ shares.

But in reality, there are many situations in which it is desirable to have a more 
flexible arrangement for reconstructing the secret key. Given some n participants, 
one may want to designate certain authorized groups of participants who can use 
their shares to recover the key. This kind of scheme is called general secret sharing 
scheme. 

Formally, a general secret sharing scheme is a method of sharing a secret $K$
among a finite set of participants $\mathcal{P} = \{P_1, P_2, \ldots, P_n\}$ in such
a way that

1. If the participants in $A \subseteq \mathcal{P}$ are qualified to know the secret,
then by pooling together their partial information, they can reconstruct the secret $K$,

2. Any set $B \subset \mathcal{P}$ which is not qualified to know $K$ cannot
reconstruct the secret $K$.

The key is chosen by a special participant $\mathcal{D}$, called the dealer, and it
is usually assumed that $\mathcal{D} \notin \mathcal{P}$. The dealer gives partial
information, called a share or shadow, to each participant to share the secret key
$K$. The collection of subsets of participants that can reconstruct the secret in
this way is called the access structure $\Gamma$. $\Gamma$ is usually monotone, that
is, if $X \in \Gamma$ and $X \subseteq X' \subseteq \mathcal{P}$, then
$X' \in \Gamma$. A minimal qualified subset $Y \in \Gamma$ is a subset of
participants such that $Y' \notin \Gamma$ for all proper subsets $Y' \subset Y$. The
basis of $\Gamma$, denoted by $\Gamma_0$, is the family of all minimal qualified
subsets. A secret sharing scheme is said to be perfect if condition 2 of the above
definition is strengthened as follows: any unauthorized group of shares cannot be
used to gain any information about the secret key, that is, if an unauthorized
subset of participants $B \subset \mathcal{P}$ pool their shares, then they can
determine nothing more than any outsider about the value of the secret $K$.
12.11.1 Shamir’s Threshold Secret Sharing Scheme


Shamir’s original $(t, n)$-threshold scheme for $n \geq t \geq 2$, given in Shamir
(1979), is based on polynomial interpolation of $t$ points in a two dimensional
plane. The basic intuition behind the scheme is that it requires two points to
define a unique straight line that passes through these two points, it requires
three points to fully define a unique quadratic passing through these three points,
it requires four points to fully define a unique cubic, and so on. Thus in general,
one can fit a unique polynomial of degree at most $k - 1$ to any set of $k$ points
with distinct $x$-coordinates. To construct a $(t, n)$-threshold scheme, the dealer
chooses a polynomial $f(x)$ of degree $t - 1$ in a two dimensional plane. Note that
any $t$ points on the curve $f(x)$ determine the curve uniquely, i.e., if we are
given $t$ or more points on the curve $f(x)$, we can uniquely reconstruct $f(x)$,
while there are infinitely many such curves which pass through any $t - 1$ or fewer
points on the curve $f(x)$. So $t - 1$ or fewer points on the curve $f(x)$ cannot
reconstruct the curve $f(x)$ uniquely. Thus $n$ points on the curve
$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{t-1} x^{t-1}$ of degree $t - 1$, with
$a_0 = secret$, are chosen by the dealer for the $n$ participants. To each
participant, a point is given. Thus when $t$ or more participants come together,
they can reconstruct the polynomial $f(x)$ to get the constant coefficient
$a_0 = secret$. But if any set of $t - 1$ or fewer participants come together, they
cannot reconstruct $a_0$ uniquely. In fact they have infinitely many choices for
$a_0$. This intuitive idea helps us in building Shamir’s $(t, n)$-threshold secret
sharing scheme. Instead of taking a two dimensional Euclidean plane, Shamir proposed
the scheme over a finite field $GF[q]$ having $q$ elements such that the secret $K$
and the number of participants $n$ are less than $q$. Let $f(x)$ be a polynomial of
degree at most $t - 1$ over the finite field $GF[q]$. Assume that, for $t$ distinct
elements $x_j$ ($1 \leq j \leq t$) of $GF[q]$, the values of $f(x_j)$ are known.
Hence the system of $t$ linearly independent equations


$$f(x_j) = \sum_{i=0}^{t-1} a_i x_j^i,$$

in $t$ unknowns $a_0, a_1, \ldots, a_{t-1}$, can be obtained. Lagrange’s interpolation
can now be used to determine uniquely the $t$ unknowns. So the polynomial can be
recovered from the $t$ points. Shamir (1979) used this property of polynomial
interpolation to construct a $t$-out-of-$n$ threshold scheme.

For simplicity, take $GF[q]$ to be $\mathbb{Z}_q$, for some large prime $q$ such
that $n < q$. This set $\mathbb{Z}_q$ is the key space, i.e., the secret key $K$ is
to be chosen from $\mathbb{Z}_q$. In Shamir’s scheme a secret $K$ from
$\mathbb{Z}_q$ and then a polynomial $f(x)$ of degree $t - 1$ are chosen by the
dealer in such a way that the constant term of the polynomial is $K$, i.e.,
$f(0) = K$. The participants are labeled $P_1, P_2, \ldots, P_n$. The dealer chooses
$n$ distinct non-zero elements from $\mathbb{Z}_q$, say $x_1, x_2, \ldots, x_n$ (that
is why $q \geq n + 1$). These $x_i$’s are made public. For $i = 1, 2, \ldots, n$,
participant $P_i$ is given the values $f(x_i)$ and $x_i$, i.e., a point
$(x_i, f(x_i))$ on the curve $f(x)$, as a share. When any $t$ participants come
together, they can use their shares to recover $f(x)$ and hence
reconstruct the secret $K$ as follows: Suppose the $t$ participants
$P_{i_1}, P_{i_2}, \ldots, P_{i_t}$ want to find the unknown value $K$. Each
participant $P_{i_k}$ has the knowledge of $x_{i_k}$ and $f(x_{i_k})$. Since the secret
polynomial is of degree at most $t - 1$, the participants may assume the form of the
polynomial as

$$A(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{t-1} x^{t-1},$$

where the $a_i$’s are unknown elements of $\mathbb{Z}_q$ with $a_0 = K$, which is
what the qualified set of participants actually retrieves as the key. The aim of the
participants is to find $a_0$ from their shares. Now $f(x_{i_k}) = A(x_{i_k})$,
$1 \leq k \leq t$. When the $t$ participants come together, they can obtain $t$
linear equations in the $t$ unknowns $a_0, a_1, \ldots, a_{t-1}$, where all the
operations are done in $\mathbb{Z}_q$. Now if the equations are linearly
independent, the participants get a unique solution to the system of equations and
$a_0 = K$ is revealed as the key. We show using the Vandermonde matrix that this
system of equations always has a unique solution. Note that this system of equations
can be written as

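$$\begin{aligned}
a_0 + a_1 x_{i_1} + a_2 x_{i_1}^2 + \cdots + a_{t-1} x_{i_1}^{t-1} &= f(x_{i_1}) \\
a_0 + a_1 x_{i_2} + a_2 x_{i_2}^2 + \cdots + a_{t-1} x_{i_2}^{t-1} &= f(x_{i_2}) \\
&\ \,\vdots \\
a_0 + a_1 x_{i_t} + a_2 x_{i_t}^2 + \cdots + a_{t-1} x_{i_t}^{t-1} &= f(x_{i_t}).
\end{aligned}$$

This can be written in a matrix form as follows:

$$\begin{pmatrix}
1 & x_{i_1} & x_{i_1}^2 & \cdots & x_{i_1}^{t-1} \\
1 & x_{i_2} & x_{i_2}^2 & \cdots & x_{i_2}^{t-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{i_t} & x_{i_t}^2 & \cdots & x_{i_t}^{t-1}
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{t-1} \end{pmatrix}
=
\begin{pmatrix} f(x_{i_1}) \\ f(x_{i_2}) \\ \vdots \\ f(x_{i_t}) \end{pmatrix}.$$

The coefficient matrix is known as the Vandermonde matrix and its determinant is of
the form

$$\prod_{1 \leq j < k \leq t} (x_{i_k} - x_{i_j}) \pmod{q}.$$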

Since $\mathbb{Z}_q$ is a field and the $x_{i_k}$’s are all distinct, the determinant
of the coefficient matrix is always non-zero and hence the system of equations has a
unique solution. This shows that any group of $t$ participants can recover the key
in this threshold scheme.
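
The share distribution and the reconstruction by $t$ participants may be sketched
in Python as follows. This is a minimal sketch under our own assumptions: the
function names are hypothetical, the public points are taken to be
$x_i = i$, the prime $q = 8191$ is a toy choice, and the secret is recovered by
evaluating Lagrange's interpolation formula at $x = 0$ rather than by solving the
Vandermonde system explicitly. Python 3.8 or later is assumed for the modular
inverse.

```python
import secrets


def make_shares(secret: int, t: int, n: int, q: int):
    """Dealer: return n shares (x_i, f(x_i)) of a (t, n)-threshold scheme over Z_q."""
    if not (2 <= t <= n < q):
        raise ValueError("need 2 <= t <= n < q")
    # f(x) = a_0 + a_1 x + ... + a_{t-1} x^{t-1} with a_0 = secret; the other
    # coefficients are random, so the degree is at most t - 1.
    coeffs = [secret % q] + [secrets.randbelow(q) for _ in range(t - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):   # Horner's rule
            acc = (acc * x + c) % q
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]  # assumed public points x_i = i


def reconstruct(shares, q: int) -> int:
    """Recover a_0 = f(0) from t shares by Lagrange interpolation at x = 0."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for k, (xk, _) in enumerate(shares):
            if k != j:
                num = num * (-xk) % q        # factor (0 - x_k)
                den = den * (xj - xk) % q    # factor (x_j - x_k)
        secret = (secret + yj * num * pow(den, -1, q)) % q
    return secret


q = 8191                       # toy prime (2^13 - 1)
shares = make_shares(1234, t=3, n=5, q=q)
recovered = reconstruct(shares[:3], q)
```

Any three of the five shares give the same value of recovered, namely the secret
1234.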

Now let us see what happens if a group of $t - 1$ or fewer participants try to
compute the secret key $K$. Assume that $t - 1$ participants wish to collaborate and
try to guess the secret. The $t - 1$ participants generate a set of $t - 1$ equations
in $t$ unknowns. For every candidate value of $a_0$, the point $(0, a_0)$ together
with the $t - 1$ shares gives $t$ points with distinct $x$-coordinates (recall that
the public $x_i$’s are non-zero), and hence exactly one polynomial of degree at most
$t - 1$. Thus these equations have as their solution a set of $q$ polynomials

$$f(x) = a_0 + \sum_{i=1}^{t-1} a_i x^i,$$

where $a_0$ ranges over all the elements of $\mathbb{Z}_q$. Hence each of the
possible keys is equally likely. This proves that the scheme is perfectly secure.
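
For a very small field this counting argument can be checked by brute force. The
following Python sketch uses our own toy values ($q = 11$, $t = 3$ and two
hypothetical shares); it enumerates all polynomials of degree at most $t - 1$
over $\mathbb{Z}_q$ and counts, for each candidate $a_0$, how many of them pass
through the two known points.

```python
from collections import Counter
from itertools import product

q, t = 11, 3
known = [(1, 4), (2, 9)]  # t - 1 = 2 hypothetical shares (x_i, f(x_i))


def evaluate(coeffs, x: int) -> int:
    """Value of a_0 + a_1 x + ... + a_{t-1} x^{t-1} modulo q."""
    return sum(c * x**i for i, c in enumerate(coeffs)) % q


# For each candidate secret a_0, count the consistent polynomials.
counts = Counter(
    coeffs[0]
    for coeffs in product(range(q), repeat=t)
    if all(evaluate(coeffs, x) == y for x, y in known)
)
```

Every value of $a_0$ in $\mathbb{Z}_{11}$ occurs exactly once in counts, so the
two shares do not favour any candidate secret.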

Remark Note that Shamir’s secret sharing scheme is a single-use secret sharing
scheme for a single secret. For multi-use, multi-secret sharing schemes, the reader
may refer to Das and Adhikari (2010).

In the next section, we are going to discuss a special kind of secret sharing
scheme, known as visual cryptography.


12.12 Visual Cryptography 

Most secret sharing schemes are based on algebraic calculations in their
realizations. But there are some realizations that differ from ordinary secret
sharing schemes. Visual cryptography is one such secret sharing scheme. In visual
cryptography, the problem is to encrypt some written material (handwritten notes,
printed text, pictures, etc.) in a perfectly secure way in such a manner that the
decoding may be done visually, without any cryptographic computations. The concept
of visual cryptography was first proposed by Naor and Shamir (1994). A visual
cryptographic scheme for a set $\mathcal{P}$ of $n$ participants is a cryptographic
paradigm that enables a secret image to be split into $n$ shadow images called
shares, where each participant in $\mathcal{P}$ receives one share. Certain
qualified subsets of participants can “visually” recover the secret image with some
loss of contrast, but other forbidden sets of participants have no information about
the secret image. The collection of all qualified subsets is denoted by
$\Gamma_{Qual}$ and the collection of all forbidden subsets is denoted by
$\Gamma_{Forb}$. The pair $(\Gamma_{Qual}, \Gamma_{Forb})$ is called the access
structure of the scheme. A participant $P \in \mathcal{P}$ is an essential
participant if there exists a set $X \subseteq \mathcal{P}$ such that
$X \cup \{P\} \in \Gamma_{Qual}$ but $X \notin \Gamma_{Qual}$. Typically, in a
$(k, n)$-Visual Cryptographic Scheme ($(k, n)$-VCS), a page of secret image/text is
encrypted to generate $n$ pages of cipher text which may be printed on $n$
transparency sheets. If $k - 1$ or fewer transparency sheets are superimposed, they
give no information on the secret image and are indistinguishable from random noise.
However, if any $k$ of the transparencies are stacked together, the secret image is
revealed. So in a $(k, n)$-VCS, there is a secret image and a set of $n$ persons,
called participants. The secret image is split into $n$ shadow images, called
shares, and each participant receives one share. Any $k - 1$ or fewer participants
cannot decipher the secret image from their shares, but any $k$ or more participants
may together recover the secret image by photocopying their shares onto
transparencies and then stacking them. Since the reconstruction is done by the human
visual system, no computations are involved during decoding, unlike traditional
cryptographic schemes where a fair amount of computation is needed to reconstruct
the plain text.

In this section, we shall discuss different techniques for black and white visual
cryptography, in which the secret is a black and white image made up of black and
white pixels. Two parameters are very important in visual cryptography, namely
the pixel expansion and the contrast. Pixel expansion is the number of pixels on
the transparencies corresponding to the shares (each such pixel is called a subpixel)
needed to encode one pixel of the original secret image. The contrast, on the other
hand, is the clarity with which the reconstructed image is visible.

Fig. 12.1 (2, 2)-VCS for black and white image (panel label: Share 2, no information
about the secret image)

Now let us explain how a visual cryptographic scheme may be constructed. 


12.12.1 (2, 2)-Visual Cryptographic Scheme

Suppose we have a secret black and white image, i.e., an image made up of only
black and white pixels. Suppose further that we want to distribute the secret image
as shares among a set of two participants in such a way that it is not possible to
decode the secret image from a single share, but if both participants come together
and superimpose their shares, they are able to decode the secret image. This
scheme is known as the (2, 2)-Visual Cryptographic Scheme (in short, (2, 2)-VCS),
illustrated in Fig. 12.1.

Naor and Shamir (1994) devised the following scheme for the (2, 2)-VCS. The
algorithm specifies how to encode a single pixel, and it is applied to every
pixel in the image to be shared. A pixel P is split into two pixels in each of the two
shares (each such pixel in the shares is called a subpixel). If P is white, then a coin
toss is used to randomly choose one of the first two rows in Fig. 12.2.

If P is black, then a coin toss is used to randomly choose one of the last two rows
in Fig. 12.2. Then the pixel is encrypted as two subpixels in each of the two shares,
as determined by the chosen row in Fig. 12.2. Every pixel is encrypted using a new
coin toss.

Suppose we look at the two subpixels, in the first share, corresponding to the
pixel P in the secret image. One of these two subpixels is black and the other is
white. Moreover, each of the two possibilities “black-white” and “white-black” is
equally likely to occur, independent of whether the corresponding pixel in the secret
image is black or white. Thus just by looking at the first share it is not possible to
predict whether the subpixels in the share correspond to a black or white pixel in the
secret image. The same argument is applicable for the second share also. Since all
the pixels in the secret image were encrypted using independent random coin flips,
there is no information to be gained by looking at any group of pixels on a share,
either. This demonstrates the security of the scheme.

Fig. 12.2 (2, 2)-VCS for black and white image (columns: pixel, share #1, share #2,
superimposition of two shares)

Now we consider the situation when the two shares are superimposed. This is shown
in the last column of Fig. 12.2. If the pixel P in the original image is black, then
after superimposition of the two shares we get two black subpixels in the superimposed
image, whereas if P is white, then after superimposition of the two shares we get one
white and one black subpixel in the superimposed image. Thus in this particular case
the reconstructed pixel has a grey level of 1 if P is black and a grey level of 1/2
if P is white. So we see that in the case of a white pixel in the secret image,
there will be a 50 % loss of contrast in the reconstructed image, but it should still
be visible.

Now we shall explain mathematically the (2, 2)-VCS as described in Fig. 12.2.
First let us try to explain the phenomenon of superimposition of transparencies
mathematically. Note that when a black pixel, printed on a transparency, is
superimposed over a white pixel on a transparency (actually left blank), the visual
effect is a black pixel. Similarly, if both the pixels are black, the superimposed
pixel will look black. The superimposed pixel will look white only when both the
pixels are white. Now let us denote the superimposition operation by “*”, a white
pixel by 0 and a black pixel by 1. Then, as mentioned above, we have 1 * 1 = 1,
1 * 0 = 1, 0 * 1 = 1 and 0 * 0 = 0. If we look at the operation * a little carefully,
we see that it is nothing but the Boolean “or” operation. Thus the superimposition
of two pixels is simply the Boolean “or” operation.
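The stacking operation and the grey levels discussed above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names `stack` and `grey_level` and the list `FIG_12_2_ROWS` are hypothetical, and the four rows are transcribed from the description of Fig. 12.2 in the text (0 = white, 1 = black).

```python
def stack(*shares):
    """Superimpose transparencies: subpixel-wise Boolean OR (1 = black, 0 = white)."""
    return [int(any(bits)) for bits in zip(*shares)]


def grey_level(subpixels):
    """Fraction of black subpixels in a block of subpixels."""
    return sum(subpixels) / len(subpixels)


# The four rows of Fig. 12.2 as (pixel, share 1, share 2), as described in the text.
FIG_12_2_ROWS = [
    ("white", [0, 1], [0, 1]),
    ("white", [1, 0], [1, 0]),
    ("black", [0, 1], [1, 0]),
    ("black", [1, 0], [0, 1]),
]

for pixel, share1, share2 in FIG_12_2_ROWS:
    stacked = stack(share1, share2)
    print(pixel, share1, share2, "->", stacked, "grey level:", grey_level(stacked))
```

Each single share has grey level 1/2 regardless of the secret pixel, while the stacked block has grey level 1/2 for a white pixel and 1 for a black pixel, in agreement with the discussion above.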

Now let us explain the share distribution algorithm, as described in Fig. 12.2,
mathematically. As mentioned earlier, if the pixel of the secret image is 0, then one
0 pixel and one 1 pixel are given to the first share holder, while one 0 pixel and one
1 pixel are given to the second share holder. The same thing may also be done by
giving one 1 pixel and one 0 pixel to the first share holder, and one 1 pixel and one
0 pixel to the second share holder. On the other hand, for a black pixel in the secret
image, one 0 pixel and one 1 pixel are given to the first share holder, while one 1
pixel and one 0 pixel are given to the second share holder. The same thing may also
be done by giving one 1 pixel and one 0 pixel to the first share holder, and one 0
pixel and one 1 pixel to the second share holder. This phenomenon may be explained by
the two $2 \times 2$ Boolean matrices $S^0$ and $S^1$ given as follows:

$$S^0 = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad S^1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$

These two matrices are called basis matrices of the (2, 2)-VCS. First we observe that
the two rows of $S^0$ correspond to the “white-black” and “white-black” combinations
as given in the first line of the third column of Fig. 12.2. Similarly, the two rows of
$S^1$ correspond to the “white-black” and “black-white” combinations as given in the
third line of the third column of Fig. 12.2. Now if we interchange the columns of
$S^0$, we see that the two rows of $S^0$ correspond to the “black-white” and
“black-white” combinations as given in the second line of the third column of
Fig. 12.2. Similarly, if we interchange the columns of $S^1$, we see that the two rows
of $S^1$ correspond to the “black-white” and “white-black” combinations as given in
the fourth line of the third column of Fig. 12.2. That is why $S^0$ and $S^1$ are
called the basis matrices: these matrices can construct the (2, 2)-VCS as follows.

Let the pixel P in the secret image be black. Since P is black, we use $S^1$, the
basis matrix corresponding to the black pixel. We apply a random permutation to
the columns of $S^1$ and let the resulting matrix be $T^1$. We give the first row of
$T^1$ to the first participant as a share and the second row to the second participant
as a share. A similar procedure holds if P is a white pixel. Now for the security
analysis, if we look at any individual share, it may be either “0 1” or “1 0”. But
both these patterns are present for both the black and white pixels. So just by
looking at an individual share, it is not possible to predict correctly from which
matrix it has come, i.e., we can guess from which matrix it has come only with
probability 1/2. Note that any person having no share can also guess with probability
1/2. Thus with only one share, a participant has no extra privilege over a person
having no share. But if we superimpose the two shares, for a black pixel we get “1 1”,
while for a white pixel we get either “1 0” or “0 1”. So we can distinguish the black
and white pixels in the superimposed image. Thus to construct a VCS, we only need to
construct the basis matrices $S^0$ and $S^1$.

Note that in the above (2, 2)-VCS, for each pixel we are giving two subpixels to each
share. So the size of the share will increase. The number of subpixels required to
encode one pixel of the secret image (i.e., the number of columns of the basis
matrices $S^0$ or $S^1$) is known as the pixel expansion. In the above (2, 2)-VCS
the pixel expansion is 2. Pixel expansion determines the size of the share. In visual
cryptography we want the pixel expansion to be as small as possible.

Another very important parameter in visual cryptography is the relative contrast.
In order that the recovered image is clearly discernible, it is important that the
grey level of a black pixel be darker than that of a white pixel. Informally, the
difference in the grey levels of the two pixels is called contrast. We want the
contrast to be as large as possible. Here the relative contrast is $\alpha(m) = 1/2$.
Fig. 12.3 (k, n)-VCS for black and white image (labels: n shares; arbitrary k shares;
no information about the secret image is obtained)


12.12.2 Visual Threshold Schemes 


A $(k, n)$-threshold structure is any access structure
$(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ in which

$$\Gamma_0 = \{B \subseteq V : |B| = k\}$$

and

$$\Gamma_{\mathrm{Forb}} = \{B \subseteq V : |B| \le k - 1\}.$$

In any $(k, n)$-threshold VCS, the image is visible if any $k$ or more participants
stack their transparencies (as shown in Fig. 12.3), but totally invisible if fewer
than $k$ transparencies are stacked together or analyzed by any other method. In a
strong $(k, n)$-threshold VCS, the image remains visible if more than $k$ participants
stack their transparencies.


12.12.3 The Model for Black and White VCS 

Let $V = \{1, \ldots, n\}$ be a set of elements called participants and let $2^V$
denote the set of all subsets of $V$. Let $\Gamma_{\mathrm{Qual}}$ and
$\Gamma_{\mathrm{Forb}}$ be subsets of $2^V$, where
$\Gamma_{\mathrm{Qual}} \cap \Gamma_{\mathrm{Forb}} = \emptyset$. We will refer to the
members of $\Gamma_{\mathrm{Qual}}$ as qualified sets and the members of
$\Gamma_{\mathrm{Forb}}$ as forbidden sets. The pair
$(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ is called the access structure of
the scheme.

In general, $\Gamma_{\mathrm{Qual}} \cup \Gamma_{\mathrm{Forb}}$ need not be $2^V$.

Let $\Gamma_0 = \{A \mid A \in \Gamma_{\mathrm{Qual}} \wedge \forall A' \subset A,\ A' \notin \Gamma_{\mathrm{Qual}}\}$
be the collection of all minimal qualified sets.

A participant $P \in V$ is an essential participant if there exists a set
$X \subseteq V$ such that $X \cup \{P\} \in \Gamma_{\mathrm{Qual}}$ but
$X \notin \Gamma_{\mathrm{Qual}}$. If a participant $P \in V$ is not an essential
participant, then we call it a non-essential participant.




Let $\Gamma \subseteq 2^V \setminus \{\emptyset\}$ ($\Gamma \subseteq 2^V$). If
$A \in \Gamma$ and $A \subseteq A' \subseteq V$ ($A' \subseteq A \subseteq V$) implies
$A' \in \Gamma$, then $\Gamma$ is said to be monotone increasing (decreasing) on $V$.
If $\Gamma_{\mathrm{Qual}}$ is monotone increasing, $\Gamma_{\mathrm{Forb}}$ is monotone
decreasing and $\Gamma_{\mathrm{Qual}} \cup \Gamma_{\mathrm{Forb}} = 2^V$, then the
access structure is called strong and $\Gamma_0$ is called a basis. A visual
cryptographic scheme with a strong access structure will be termed a strong visual
cryptography scheme. In this section we mostly deal with strong access structures,
although some parts deal with restricted access structures. Throughout, we presume
that $\Gamma_{\mathrm{Qual}} \cup \Gamma_{\mathrm{Forb}} = 2^V$. So any
$X \subseteq V$ is either a qualified set or a forbidden set of participants.
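For small participant sets, a strong access structure can be enumerated directly from its basis. The following Python sketch is an illustration under stated assumptions: all function names are hypothetical, the qualified sets are taken to be exactly the supersets of the basis sets, and every other subset (including the empty set) is treated as forbidden.

```python
from itertools import combinations


def all_subsets(participants):
    """All subsets of the participant set, as frozensets."""
    items = sorted(participants)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def strong_access_structure(participants, basis):
    """Split 2^V into qualified sets (supersets of a basis set) and forbidden sets."""
    basis = [frozenset(b) for b in basis]
    qualified, forbidden = [], []
    for subset in all_subsets(participants):
        if any(b <= subset for b in basis):
            qualified.append(subset)
        else:
            forbidden.append(subset)
    return qualified, forbidden


def minimal_qualified(qualified):
    """Qualified sets none of whose proper subsets is qualified."""
    qset = set(qualified)
    return [a for a in qualified if not any(b < a for b in qset)]


def essential_participants(participants, qualified):
    """Participants P for which some X is not qualified while X with P added is."""
    qset = set(qualified)
    return {
        p for p in participants
        if any((x | {p}) in qset and x not in qset for x in all_subsets(participants))
    }


V = {1, 2, 3}
qualified, forbidden = strong_access_structure(V, [{1, 2}])
print(sorted(map(sorted, qualified)))
print(sorted(essential_participants(V, qualified)))
```

In this toy example with basis {{1, 2}} on three participants, participant 3 never turns a forbidden set into a qualified one, so it is non-essential.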

We further assume that the secret image consists of a collection of black and
white pixels, each pixel being shared separately. To understand the sharing process,
consider the case where the secret image consists of just a single black or white
pixel. On sharing, this pixel appears in the $n$ shares distributed to the
participants. However, in each share the pixel is subdivided into $m$ subpixels. This
$m$ is called the pixel expansion, i.e., the number of pixels on the transparencies
corresponding to the shares (each such pixel is called a subpixel) needed to represent
one pixel of the original image. The shares are printed on transparencies. So a
“white” subpixel is actually an area where nothing is printed and which is left
transparent. We assume that the subpixels are sufficiently small and close enough
that the eye averages them to some shade of grey.

In order that the recovered image is clearly discernible, it is important that the
grey level of a black pixel be darker than that of a white pixel. Informally, the
difference in the grey levels of the two pixel types is called contrast. We want the
contrast to be as large as possible. Three variables control the perception of black
and white regions in the recovered image: a threshold value ($t$), a relative contrast
($\alpha(m)$) and the pixel expansion ($m$). The threshold value is a numeric value
that represents a grey level that is perceived by the human eye as the color black.
The value $\alpha(m) \cdot m$ is the contrast, which we want to be as large as
possible. We require that $\alpha(m) \cdot m \ge 1$ to ensure that black and white
areas will be distinguishable.

Notations Consider an $n \times m$ Boolean matrix $M$ and let
$X \subseteq \{1, 2, \ldots, n\}$. By $M[X]$ we will denote the $|X| \times m$
submatrix obtained from $M$ by retaining only the rows indexed by the elements of $X$.
$M_X$ will denote the Boolean “or” of the rows of $M[X]$. The Hamming weight $w(V)$
is the number of 1's in a Boolean vector $V$.

Definition 12.12.1 Let $(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ be an access
structure on a set of $n$ participants. Two collections (multisets) of $n \times m$
Boolean matrices $C_0$ and $C_1$ constitute a visual cryptography scheme
$(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}}, m)$-VCS if there exist values
$\alpha(m)$ and $\{t_X\}_{X \in \Gamma_{\mathrm{Qual}}}$ satisfying:

1. Any (qualified) set $X = \{i_1, i_2, \ldots, i_p\} \in \Gamma_{\mathrm{Qual}}$ can
   recover the shared image by stacking their transparencies.
   (Formally, for any $M \in C_0$, $M_X$, the “or” of the rows
   $i_1, i_2, \ldots, i_p$, satisfies $w(M_X) \le t_X - \alpha(m) \cdot m$; whereas,
   for any $M \in C_1$ it results in $w(M_X) \ge t_X$.)

2. Any (forbidden) set $X = \{i_1, i_2, \ldots, i_p\} \in \Gamma_{\mathrm{Forb}}$ has
   no information on the shared image.
   (Formally, the two collections of $p \times m$ matrices $D_t$, with
   $t \in \{0, 1\}$, obtained by restricting each $n \times m$ matrix in $C_t$ to rows
   $i_1, i_2, \ldots, i_p$ are indistinguishable in the sense that they contain the
   same matrices with the same frequencies.)


12.12.4 Basis Matrices 

To construct a visual cryptographic scheme, it is sufficient to construct the basis 
matrices corresponding to the black and white pixel. In the following, we formally 
define what is meant by basis matrices. 

Definition 12.12.2 (Adapted from Blundo et al. (1999)) Let
$(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ be an access structure on a set $V$
of $n$ participants. A $(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}}, m)$-VCS with
relative difference $\alpha(m)$ and a set of thresholds
$\{t_X\}_{X \in \Gamma_{\mathrm{Qual}}}$ is realized using the $n \times m$ basis
matrices $S^0$ and $S^1$ if the following two conditions hold:

1. If $X = \{i_1, i_2, \ldots, i_p\} \in \Gamma_{\mathrm{Qual}}$, then $S^0_X$, the
   “or” of the rows $i_1, i_2, \ldots, i_p$ of $S^0$, satisfies
   $w(S^0_X) \le t_X - \alpha(m) \cdot m$; whereas, for $S^1$ it results in
   $w(S^1_X) \ge t_X$.

2. If $X = \{i_1, i_2, \ldots, i_p\} \in \Gamma_{\mathrm{Forb}}$, the two $p \times m$
   matrices obtained by restricting $S^0$ and $S^1$ to rows $i_1, i_2, \ldots, i_p$
   are equal up to a column permutation.
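The two conditions of Definition 12.12.2 can be checked mechanically for small schemes. The following Python sketch is only an illustration: the function names are hypothetical, participants are numbered from 1, matrices are lists of rows, and it assumes the thresholds are chosen as $t_X = w(S^1_X)$, so that the largest admissible value of $\alpha(m) \cdot m$ is the smallest weight gap over the qualified sets.

```python
def or_rows(matrix, rows):
    """Boolean OR of the rows of `matrix` indexed by `rows` (participants start at 1)."""
    return [int(any(matrix[i - 1][j] for i in rows)) for j in range(len(matrix[0]))]


def weight(vector):
    """Hamming weight: number of 1's in a Boolean vector."""
    return sum(vector)


def restricted_columns(matrix, rows):
    """Sorted columns of M[X]; two matrices agree up to column permutation
    exactly when these sorted lists are equal."""
    rows = sorted(rows)
    return sorted(tuple(matrix[i - 1][j] for i in rows) for j in range(len(matrix[0])))


def check_basis_matrices(S0, S1, qualified, forbidden):
    """Return the achievable relative contrast, or None if a condition fails.
    Assumes `qualified` is non-empty."""
    m = len(S0[0])
    gaps = []
    for X in qualified:
        gap = weight(or_rows(S1, X)) - weight(or_rows(S0, X))
        if gap < 1:  # condition 1 needs alpha(m) * m >= 1
            return None
        gaps.append(gap)
    for X in forbidden:
        if restricted_columns(S0, X) != restricted_columns(S1, X):
            return None  # condition 2 fails
    return min(gaps) / m


# The basis matrices of the (2, 2)-VCS from Sect. 12.12.1.
S0 = [[0, 1], [0, 1]]
S1 = [[0, 1], [1, 0]]
print(check_basis_matrices(S0, S1, qualified=[{1, 2}], forbidden=[{1}, {2}]))
```

For the (2, 2)-VCS this reports a relative contrast of 0.5, matching the value $\alpha(m) = 1/2$ stated earlier.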


12.12.5 Share Distribution Algorithm 


We use the following algorithm to encode the secret image. For each pixel P in the 
secret image, do the following: 

1. Generate a random permutation $\pi$ of the set $\{1, 2, \ldots, m\}$.

2. If P is a black pixel, then apply $\pi$ to the columns of $S^1$; else apply $\pi$
   to the columns of $S^0$. Call the resulting matrix $T$.

3. For $1 \le i \le n$, row $i$ of $T$ comprises the $m$ subpixels of P in the
   $i$th share.
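The three steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, the secret image is assumed to be a list of rows of 0/1 pixels (1 = black), the $m$ subpixels of each pixel are laid out side by side in a row, and Python's `random` module stands in for the dealer's source of randomness (a real dealer would want a cryptographically secure source).

```python
import random


def distribute_pixel(pixel, S0, S1, rng=random):
    """Steps 1 and 2: permute the columns of the basis matrix chosen by the pixel."""
    m = len(S0[0])
    perm = list(range(m))
    rng.shuffle(perm)                      # random permutation of the columns
    basis = S1 if pixel == 1 else S0       # S1 for black, S0 for white
    return [[row[j] for j in perm] for row in basis]


def distribute_image(image, S0, S1, rng=random):
    """Step 3 for every pixel: row i of T goes into the i-th share."""
    n = len(S0)
    shares = [[[] for _ in image] for _ in range(n)]
    for r, row in enumerate(image):
        for pixel in row:
            T = distribute_pixel(pixel, S0, S1, rng)
            for i in range(n):
                shares[i][r].extend(T[i])
    return shares


def stack_shares(shares):
    """Superimpose share images: subpixel-wise Boolean OR."""
    return [[int(any(bits)) for bits in zip(*rows)] for rows in zip(*shares)]


# Example with the (2, 2)-VCS basis matrices of Sect. 12.12.1.
S0 = [[0, 1], [0, 1]]
S1 = [[0, 1], [1, 0]]
image = [[0, 1, 1], [1, 0, 0]]
shares = distribute_image(image, S0, S1)
for row in stack_shares(shares):
    print(row)
```

In the stacked output every black pixel appears as two black subpixels and every white pixel as one black and one white subpixel, while each individual share shows exactly one black subpixel per pixel.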


12.12.6 (2, n)-Threshold VCS


In this section we consider only $(2, n)$-VCS for black and white images. Naor and
Shamir (1994) first proposed a $(2, n)$-VCS for black and white images. They
constructed the 2-out-of-$n$ visual secret sharing scheme by considering the two
$n \times n$ basis matrices $S^0$ and $S^1$ given as follows.

$$S^0 = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix} \quad \text{and} \quad S^1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

$S^0$ is a Boolean matrix whose first column comprises 1's and whose remaining
entries are 0's. $S^1$ is simply the identity matrix of dimension $n$.

When we encrypt a white pixel, we apply a random permutation to the columns
of $S^0$ to obtain a matrix $T$. We then distribute row $i$ of $T$ to participant $i$.
To encrypt a black pixel, we apply the permutation to $S^1$. A single share of a black
or white pixel consists of a randomly placed black subpixel and $n - 1$ white
subpixels. Two shares of a white pixel have a combined Hamming weight of 1, whereas
any two shares of a black pixel have a combined Hamming weight of 2, which looks
darker. The visual difference between the two cases becomes clearer as we stack
additional transparencies.

To exemplify this discussion, let us take a concrete example of a (2, 4)-VCS. The
basis matrices $S^0$ and $S^1$ in this case are given by

$$S^0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad S^1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

If one examines just a single share, then it is impossible to determine whether
it represents a share of a black or a white pixel, since single shares, whether black
or white, look alike. If two shares of a black pixel are superimposed, we
obtain two black and two white subpixels. Combining the shares of a white pixel
yields only one black and three white subpixels. Therefore, on stacking two shares,
a black pixel will look darker than a white pixel. In the above example the pixel
expansion is 4 and the relative contrast for any two participants is 1/4.
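The weights quoted for the (2, 4)-VCS can be verified with a short Python sketch. The helper names below are hypothetical; the matrices are built exactly as described above (first column of 1's for $S^0$, identity matrix for $S^1$), and participants are numbered from 1.

```python
from itertools import combinations


def two_out_of_n_basis(n):
    """Basis matrices of the (2, n)-VCS: S0 has a first column of 1's, S1 is the identity."""
    S0 = [[1] + [0] * (n - 1) for _ in range(n)]
    S1 = [[int(i == j) for j in range(n)] for i in range(n)]
    return S0, S1


def or_weight(matrix, rows):
    """Hamming weight of the Boolean OR of the selected rows (participants start at 1)."""
    return sum(int(any(matrix[i - 1][j] for i in rows)) for j in range(len(matrix[0])))


n = 4
S0, S1 = two_out_of_n_basis(n)
for X in combinations(range(1, n + 1), 2):
    white, black = or_weight(S0, X), or_weight(S1, X)
    print(X, "white:", white, "black:", black, "relative contrast:", (black - white) / n)
```

Every pair of participants sees one black subpixel for a white pixel and two for a black pixel, giving the relative contrast 1/4 stated above; stacking a third share raises the black weight to 3 while the white weight stays at 1.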


12.12.7 Applications of Linear Algebra to Visual Cryptographic 
Schemes for (n, n)-VCS 

Let us construct an $(n, n)$-VCS, i.e., the secret image is distributed among a set
$V = \{1, 2, \ldots, n\}$ of $n$ participants in such a way that if all the $n$
participants superimpose their shares, they get back the secret image as a
superimposed image, but any set of $n - 1$ or fewer participants gets no information
about the secret image. So here $\Gamma_0 = \{\{1, 2, \ldots, n\}\}$. As we mentioned
earlier, to construct the scheme, it is sufficient to construct the basis matrices.
To generate the basis matrices, the dealer first associates with each participant $i$
a variable $x_i$, $i = 1, 2, \ldots, n$. Then the dealer considers the following two
systems of linear equations over $\mathbb{Z}_2$:

$$f_V = 0 \tag{12.1}$$

$$f_V = 1, \tag{12.2}$$

where $f_V = x_1 + x_2 + \cdots + x_n$. Clearly, the set of all solutions of (12.1)
over $\mathbb{Z}_2$ forms a vector space over $\mathbb{Z}_2$. Let $S^0$ and $S^1$ be
the Boolean matrices whose columns are all possible solutions of (12.1) and (12.2),
respectively, over the binary field. Then it can be shown that $S^0$ and $S^1$ satisfy
both properties of Definition 12.12.1 and thereby form basis matrices of the
$(n, n)$-VCS having pixel expansion $m = 2^{n-1}$. For more details, the reader may
refer to the following theorem from Adhikari (2013).

Theorem 12.12.1 Let $V = \{1, 2, \ldots, n\}$ be a set of $n$ participants. Then there
exists an $(n, n)$-VCS with black and white images having optimal pixel expansion
$m = 2^{n-1}$ and relative contrast $\alpha(m) = 1/2^{n-1}$.

To understand the scheme, let us consider the following example. 

Example 12.12.1 Let us construct a (4, 4)-VCS. As there are four participants,
namely 1, 2, 3, and 4, we associate a variable $x_i$ to the $i$th participant,
for $i = 1, 2, 3, 4$. Now we consider the following two systems of linear equations
over $\mathbb{Z}_2$:

$$x_1 + x_2 + x_3 + x_4 = 0 \tag{12.3}$$

$$x_1 + x_2 + x_3 + x_4 = 1. \tag{12.4}$$

Let $S^0$ and $S^1$ be the Boolean matrices whose columns are all possible solutions
of (12.3) and (12.4), respectively, over the binary field. Then $S^0$ and $S^1$ are
given by

$$S^0 = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad S^1 = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \end{bmatrix}.$$

In a (4, 4)-VCS, the secret image should be visible only if all the four participants
come together to superimpose their shares. To ensure this, we need to check that
these two matrices really satisfy the conditions of the basis matrices as in
Definition 12.12.1. Note that if $X = \{1, 2, 3, 4\}$, i.e., if all four participants
come together, and if we take $t_X = 8$, $m = 8$ and $\alpha(m) = 1/8$, then $S^0_X$,
the “or” of all the four rows of $S^0$, satisfies
$w(S^0_X) \le t_X - \alpha(m) \cdot m = 8 - \frac{1}{8} \cdot 8 = 7$, whereas, for
$S^1$, $w(S^1_X) \ge t_X = 8$. To check the second property of Definition 12.12.1, we
see that for any $X \subseteq \{1, 2, 3, 4\}$ with $|X| \le 3$, $S^0[X]$ and $S^1[X]$
are identical up to column permutation. For example, if we take $X = \{2, 3, 4\}$ and
consider $S^0[X]$ and $S^1[X]$, then we see that both matrices contain identical
patterns, i.e., exactly one $(0\ 0\ 1)^T$ column, exactly one $(0\ 1\ 0)^T$ column,
exactly one $(1\ 0\ 0)^T$ column, exactly one $(1\ 1\ 1)^T$ column, exactly one
$(0\ 0\ 0)^T$ column, exactly one $(0\ 1\ 1)^T$ column, exactly one $(1\ 0\ 1)^T$
column and exactly one $(1\ 1\ 0)^T$ column. So the restricted matrices $S^0[X]$ and
$S^1[X]$ are identical up to column permutation.
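The construction of Example 12.12.1 and the two checks just described can be reproduced by brute force. The Python sketch below is an illustration with hypothetical function names; it enumerates all vectors of $\mathbb{Z}_2^n$ (reasonable only for small $n$) and keeps the even-parity ones as columns of $S^0$ and the odd-parity ones as columns of $S^1$.

```python
from itertools import combinations, product


def n_out_of_n_basis(n):
    """Columns of S0 (S1) are the solutions of x_1 + ... + x_n = 0 (= 1) over Z_2."""
    columns0 = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == 0]
    columns1 = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == 1]
    S0 = [list(row) for row in zip(*columns0)]
    S1 = [list(row) for row in zip(*columns1)]
    return S0, S1


def or_weight(matrix, rows):
    """Hamming weight of the Boolean OR of the selected rows (participants start at 1)."""
    return sum(int(any(matrix[i - 1][j] for i in rows)) for j in range(len(matrix[0])))


def column_multiset(matrix, rows):
    """Sorted columns of M[X], used to compare matrices up to column permutation."""
    rows = sorted(rows)
    return sorted(tuple(matrix[i - 1][j] for i in rows) for j in range(len(matrix[0])))


S0, S1 = n_out_of_n_basis(4)
everyone = (1, 2, 3, 4)
print("stacked weights:", or_weight(S0, everyone), or_weight(S1, everyone))
secure = all(
    column_multiset(S0, X) == column_multiset(S1, X)
    for size in range(1, 4)
    for X in combinations(everyone, size)
)
print("forbidden sets indistinguishable:", secure)
```

With the lexicographic enumeration used here, the resulting matrices coincide with the $S^0$ and $S^1$ displayed in the example; the stacked weights are 7 and 8, and every set of at most three participants sees the same multiset of columns in both matrices.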


12.12.8 Applications of Linear Algebra to Visual Cryptographic 
Schemes for General Access Structure 

In this section we show how linear algebra may play an important role in
constructing visual cryptographic schemes for general access structures. Until now
we have discussed schemes for $(2, n)$-VCS and $(n, n)$-VCS. But for more
practical applications, we need more flexibility in the access structure. Let us start
with an example.


Example 12.12.2 Let us consider a strong access structure
$(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ on a set of 5 participants with
$\Gamma_0 = \{\{1, 2\}, \{3, 4\}, \{2, 3\}, \{3, 5\}\}$. To construct a VCS for this
access structure, let us first divide the set $\Gamma_0$ into two parts, namely
$\Gamma_{01} = \{\{1, 2\}, \{3, 4\}\}$ and $\Gamma_{02} = \{\{2, 3\}, \{3, 5\}\}$. For
$\Gamma_{01}$, let us consider the following two systems of linear equations over the
binary field:

$$\left. \begin{aligned} x_1 + x_2 &= 0 \\ x_3 + x_4 &= 0 \\ x_5 &= 0 \end{aligned} \right\} \tag{12.5}$$

$$\left. \begin{aligned} x_1 + x_2 &= 1 \\ x_3 + x_4 &= 1 \\ x_5 &= 0 \end{aligned} \right\} \tag{12.6}$$

Let $S^0_1$ and $S^1_1$ be the Boolean matrices whose columns are all possible
solutions of the systems (12.5) and (12.6), respectively, over the binary field.
Fig. 12.4 RSA key generation algorithm in SAGE
We further consider the following two systems of linear equations over the binary
field:

$$\left. \begin{aligned} x_2 + x_3 &= 0 \\ x_3 + x_5 &= 0 \\ x_1 &= 0 \\ x_4 &= 0 \end{aligned} \right\} \tag{12.7}$$

$$\left. \begin{aligned} x_2 + x_3 &= 1 \\ x_3 + x_5 &= 1 \\ x_1 &= 0 \\ x_4 &= 0 \end{aligned} \right\} \tag{12.8}$$
Table 12.9 RSA key generation algorithm

    # Key generation algorithm for RSA cryptosystem
    @interact
    def RSA_key_gen(bits = (32 .. 512)):
        # p and q: random primes of roughly `bits` bits each
        p = next_prime(randint(2^(bits-1), 2^(bits)))
        q = next_prime(randint(2^(bits-1), 2^(bits)))
        N = p*q
        phiN = (p-1)*(q-1)
        e = 2^16 - 1
        # choose a new random e until it is coprime to phi(N)
        while gcd(e, phiN) != 1:
            e = randint(17, 2^(bits))
        # d is the inverse of e modulo phi(N)
        e1 = mod(e, phiN)
        d1 = e1^(-1)
        d = ZZ(d1)
        print '\n'
        print 'the public keys are:'
        print 'N=', N
        print 'e=', e
        print '\n'
        print 'the private keys are:'
        print 'p=', p
        print 'q=', q
        print 'd=', d
        print '\n'

As before, the solutions of (12.7) and (12.8) give

$$S^0_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad S^1_2 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 1 & 0 \end{bmatrix}.$$

Finally,

$$S^0 = S^0_1 \| S^0_2 = \begin{bmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad S^1 = S^1_1 \| S^1_2 = \begin{bmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

form the basis matrices for the given access structure having pixel expansion 6.
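The construction of Example 12.12.2 can be sketched in Python by solving the small linear systems by brute force and concatenating the resulting blocks. This is only an illustration under stated assumptions: the function names are hypothetical, the basis sets are paired exactly as in the example, variables of participants outside a pair are fixed to 0 as in (12.5) to (12.8), and the column order inside a block may differ from the matrices printed above (which does not matter, since columns are permuted at random during share distribution).

```python
from itertools import combinations, product


def pair_block(n, B1, B2, c):
    """Matrix whose columns solve, over Z_2: sum of x_i over B1 = c, sum over B2 = c,
    and x_i = 0 for every participant i outside B1 and B2."""
    used = set(B1) | set(B2)
    columns = [
        x for x in product((0, 1), repeat=n)
        if sum(x[i - 1] for i in B1) % 2 == c
        and sum(x[i - 1] for i in B2) % 2 == c
        and all(x[i - 1] == 0 for i in range(1, n + 1) if i not in used)
    ]
    return [list(row) for row in zip(*columns)]


def concatenate(A, B):
    """Row-wise concatenation A || B of two matrices with the same number of rows."""
    return [row_a + row_b for row_a, row_b in zip(A, B)]


def or_weight(matrix, rows):
    return sum(int(any(matrix[i - 1][j] for i in rows)) for j in range(len(matrix[0])))


def column_multiset(matrix, rows):
    rows = sorted(rows)
    return sorted(tuple(matrix[i - 1][j] for i in rows) for j in range(len(matrix[0])))


def scheme_is_valid(S0, S1, n, basis):
    """Check contrast on every qualified set and indistinguishability on every forbidden set."""
    for size in range(1, n + 1):
        for X in combinations(range(1, n + 1), size):
            if any(B <= set(X) for B in basis):
                if or_weight(S1, X) - or_weight(S0, X) < 1:
                    return False
            elif column_multiset(S0, X) != column_multiset(S1, X):
                return False
    return True


n = 5
basis = [{1, 2}, {3, 4}, {2, 3}, {3, 5}]
S0 = concatenate(pair_block(n, {1, 2}, {3, 4}, 0), pair_block(n, {2, 3}, {3, 5}, 0))
S1 = concatenate(pair_block(n, {1, 2}, {3, 4}, 1), pair_block(n, {2, 3}, {3, 5}, 1))
print("pixel expansion:", len(S0[0]))
print("valid for the access structure:", scheme_is_valid(S0, S1, n, basis))
```

The exhaustive check over all non-empty subsets of the five participants is feasible only because the example is tiny; it is meant as a sanity check of the example, not as a proof of the general theorem.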
Fig. 12.5 Output of RSA key generation algorithm in SAGE (for bits = 512, the output
lists the public keys N and e and the private keys p, q and d)


The above example actually follows from the following theorem proved in Ad- 
hikari (2013). 

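Theorem 12.12.2 For any given strong access structure
$(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ on a set $V = \{1, 2, \ldots, n\}$
of $n$ participants with $\Gamma_0 = \{B_1, B_2, \ldots, B_k\}$, where
$B_i \subseteq V$ for all $i = 1, 2, \ldots, k$, and for any permutation
$\sigma \in S_k$, the symmetric group of degree $k$, there exists a strong visual
cryptographic scheme $(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}}, m)$ on $V$ with
$m = m_\sigma$, where $m_\sigma$ is given as follows:

$$m_\sigma = \begin{cases} \sum_{i=1}^{l} 2^{|B_{\sigma(2i-1)} \cup B_{\sigma(2i)}| - 2} & \text{if } k = 2l,\ l \ge 1, \\ \sum_{i=1}^{l} 2^{|B_{\sigma(2i-1)} \cup B_{\sigma(2i)}| - 2} + 2^{|B_{\sigma(2l+1)}| - 1} & \text{if } k = 2l + 1,\ l \ge 0. \end{cases}$$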

Remark As a particular case of a general access structure, we can get a
$(k, n)$-threshold scheme. For example, if we want to construct a (3, 5)-threshold
VCS, we can consider $\Gamma_0 = \{\{1, 2, 3\}, \{1, 2, 4\}, \{1, 2, 5\}, \{1, 3, 4\},
\{1, 3, 5\}, \{1, 4, 5\}, \{2, 3, 4\}, \{2, 3, 5\}, \{2, 4, 5\}, \{3, 4, 5\}\}$ and
apply Theorem 12.12.2 to get the (3, 5)-VCS.

So, in general, we may have the following theorem for $(k, n)$-VCS, as proved
in Adhikari (2013).
Fig. 12.6 RSA encryption algorithm in SAGE

Theorem 12.12.3 Let $(\Gamma_{\mathrm{Qual}}, \Gamma_{\mathrm{Forb}})$ be an access
structure on a set $V = \{1, 2, \ldots, n\}$ of $n$ participants with
$\Gamma_0 = \{A \subseteq V : |A| = k\}$, $2 \le k \le n$. Then there exists
a strong $(k, n)$-VCS with



and relative contrast a(m) = 


Remark For further studies, the reader may refer to Adhikari and Sikdar (2003),
Adhikari and Bose (2004), Adhikari et al. (2004, 2007), Adhikari and Adhikari (2007),
Adhikari (2006, 2013), Ateniese et al. (1996a, b), Blundo et al. (1999, 2003) and
Naor and Shamir (1994).


In the next section, we discuss a nice application of the open-source software
SAGE to the implementation of cryptographic schemes with very large integers, such
as integers having more than 300 digits. For actual implementation of public-key
cryptographic schemes, such big numbers are essential. In the next section, we show
how to deal with these numbers.

Fig. 12.7 Output of RSA encryption algorithm in SAGE (inputs shown: txt = Test for
RSA, bits = 512; output: ROUND = 1, Plaintext Block = Test for RSA, followed by the
final cipher text in binary)


12.13 Open-Source Software: SAGE 

Let us start our discussion with open-source software. Free and open-source
software (FOSS), or free/libre/open-source software (FLOSS), is a class of software
that is not only free but also open source, in the sense that it is liberally
licensed to grant users the right to use, copy, study, change, and improve its
design through the availability of its source code. This approach has gained both
momentum and acceptance, as the potential benefits have been increasingly
recognized by both individuals and corporations. SAGE is one useful outcome of this
approach. SAGE (System for Algebra and Geometry Experimentation) is free and
open-source mathematics software that is very useful for research and teaching in
algebra, geometry, number theory, cryptography, numerical computation and related
areas. The overall goal of SAGE is to create a viable, free, open-source alternative
to costly mathematical software such as Maple, Mathematica, Magma, and MATLAB.
SAGE is sometimes called SAGEMATH to distinguish it from other uses of the word.
Historically, the first version of SAGE was released on 24 February 2005 as free and
open-source software under the terms of the GNU General Public License. The
originator and

Table 12.10 RSA encryption algorithm using SAGE

# RSA encryption algorithm
from sage.crypto.util import ascii_to_bin, bin_to_ascii
@interact
def RSA_encryption(txt = "Test for RSA", bits = (32 .. 512),
                   N = 12268090136639592443, e = 2967411197):
    btxt = ascii_to_bin(txt)  # converts the ASCII text to binary; to use it we need to
                              # import ascii_to_bin as written in the first line of the
                              # program. ascii_to_bin outputs a binary string
    lbtxt = len(btxt)         # the function len provides the length of any string
    len_block = 2*bits/2      # length of each block of the plain text; note that if we
                              # take len_block = 2*bits, then it might exceed the range of N
    number_of_blocks = ceil(lbtxt / len_block)  # number of plain text blocks
    cblockarray = []  # it will contain the binary representation of c = m^e (mod N)
                      # corresponding to each block
    carray = []       # it will contain the required number of zeros to be appended at the
                      # left of the binary representation of c = m^e (mod N) to make it a
                      # 2*bits representation of c, corresponding to each block
    for i in range(number_of_blocks):  # it runs ROUND = number_of_blocks times
        print ' '
        print 'ROUND = ', i+1
        print ' '
        mbin = btxt[i*len_block : (i+1)*len_block]  # mbin contains a binary string of
                              # length len_block. We shall convert this binary string
                              # into the corresponding decimal representation
        print 'Plaintext Block = ', bin_to_ascii(mbin)
        m = ZZ(str(mbin), base=2)  # to compute m^e (mod N), we need to convert mbin
                                   # into the corresponding decimal representation
        C = Mod(m, N)^e
        CC = ZZ(C)
        cblockarray = CC.binary()  # to convert CC into a binary string of length 2*bits
                                   # we need to pad (2*bits - len(cblockarray)) zeros
                                   # to the left of cblockarray
        padclength = 2*bits - ZZ(len(cblockarray))  # number of zeros to be padded
        for k in range(padclength):
            carray.append('0')       # at the end of this for loop, carray has received
                                     # padclength many zeros
        carray.append(cblockarray)   # appends the contents of cblockarray to carray
    # End of the main for loop
    cbin = ''.join(carray)  # join the contents of carray to get the final binary string
                            # representing the whole cipher text, stored in cbin
    print 'final cipher text in binary', cbin  # print cbin as the cipher text in binary


the leader of the SAGE project was William Stein, a mathematician at the
University of Washington. SAGE uses the programming language Python. Both
students and professionals contribute to the development of SAGE, which is
supported by both volunteer work and grants. The website for SAGE is
http://www.sagemath.org. The software may be downloaded from the links
available on that website. An on-line computation facility is also available at that web-

Table 12.11 RSA decryption algorithm using SAGE

# RSA decryption algorithm
from sage.crypto.util import ascii_to_bin, bin_to_ascii
@interact
def RSA_decryption(ctext = "0011100011010000011100101100100110101111111010010001111100101001",
                   bits = (32 .. 512), d = 3319087257124301681, N = 6915083087689750517):
    f = []                   # this is an array to be used later on
    lctext = len(ctext)      # it provides the length of the cipher text
    nb = lctext / (2*bits)   # it provides the number of blocks
    cblockarray = []
    for i in range(nb):
        print ' '
        print 'ROUND = ', i+1
        print ' '
        cblockarray = ctext[i*2*bits : (i+1)*2*bits]  # crops a block of binary string
                                 # of length 2*bits from ctext and stores it in cblockarray
        c = ZZ(str(cblockarray), base=2)  # converts the binary string into decimal
        P = Mod(c, N)^d   # it calculates c^d (mod N); as a result, P becomes
                          # an element of the ring of integers modulo N
        M = ZZ(P)         # converts the element of the ring Z_N into an integer
        Mbin = M.binary() # Mbin contains the binary representation of M
        padlen = ZZ(8 - ZZ(len(Mbin)).mod(8))  # finally we need to convert the binary
                          # to ASCII, so the length of Mbin must be a multiple of 8;
                          # padlen contains the number of zeros to be padded to the
                          # left of Mbin to make its length a multiple of 8
        for k in range(padlen):
            f.append('0')   # the array f receives the zeros to be padded
        f.append(Mbin)      # it appends the content of Mbin to the content of f
    fbin = ''.join(f)           # it contains the plain text in binary form
    ftxt = bin_to_ascii(fbin)   # it converts the binary plain text into actual plain text
    print 'Final recovered message = ', ftxt  # final recovered plain text



site. We can simply create an on-line account at either http://www.sagenb.org or
http://www.sagenb.kaist.ac.kr and use SAGE through the web browser on our
computer.

In the next section, we show how SAGE may be used to implement three
public-key cryptosystems, namely the RSA, ElGamal, and Rabin cryptosystems,
with large numbers.


12.13.1 SAGE Implementation of RSA Cryptosystem

We have already discussed the RSA cryptosystem in Sect. 12.6. Now we are
going to explain how RSA may be implemented using SAGE with large numbers.
Recall that, when Alice wants to send a secret message to Bob using the RSA
cryptosystem, Alice and Bob have to run three algorithms, namely, the key
generation algorithm (done by Bob), the encryption algorithm (done by Alice),
and the decryption algorithm

[Screenshot of the SAGE worksheet: the RSA decryption code of Table 12.11, run
interactively with bits = 512. The output reads "ROUND = 1" and
"Final recovered message = Test for RSA".]

Fig. 12.8 RSA decryption algorithm with output in SAGE



(done by Bob). First we shall explain the key generation algorithm to be
implemented by Bob. Here we define a function called “RSA_key_gen”, which
takes as input the security parameter, i.e., the length of each of the primes
(the number of bits needed to represent p or q), and produces the public and
private keys. We use “@interact” so that the function “RSA_key_gen” can take
its input, i.e., the bit length of p or q, in an interactive way. Thanks to
“@interact”, we can change the bit length (here from 32 bits to 512 bits)
interactively while running the program, without changing the main body of the
program, as shown in Fig. 12.4. If we write
“@interact def RSA_key_gen(bits = (16 .. 1024)):”, then we can vary the length
of the primes from 16 bits to 1024 bits interactively. In this

[Screenshot of the SAGE worksheet: the ElGamal key generation code of
Table 12.12, run with bits = 128. The output is reproduced below.]

the public keys are:
g= 3
p= 229528577508036423867334854552268904441
X= 56361825206093085535190352362098016333
the private key is:
a= 208587833051252351799933094476558972210

Fig. 12.9 ElGamal key generation algorithm with output in SAGE



Table 12.12 ElGamal key generation algorithm

#Key Generation algorithm for ElGamal Cryptosystem
@interact
def ELGAMAL_key_gen(bits = (16 .. 512)):
    p = next_prime(randint(2^(bits-1), 2^(bits)-1))
    g = primitive_root(p)
    a = randint(1, p-1)
    X = Mod(g, p)^a
    print '\n'
    print 'the public keys are:'
    print 'g=', g
    print 'p=', p
    print 'X=', X
    print 'the private key is:'
    print 'a=', a


key generation program, we are going to use the following functions whose utilities 
are described below: 

• randint(a, b): This function outputs a random integer in the interval [a, b]. 

Table 12.13 ElGamal encryption algorithm using SAGE

# ElGamal encryption algorithm
from sage.crypto.util import ascii_to_bin, bin_to_ascii
@interact
def ElGamal_encryption(txt = "Dr. Avishek Adhikari", bits = (16 .. 128), g = 10,
                       p = 230430292394661874796713505744556451393,
                       X = 199001885002992147485058529117175239418):
    btxt = ascii_to_bin(txt)
    lbtxt = len(btxt)
    chr = bits               # block size
    lb = ceil(lbtxt / chr)   # no of blocks
    cblockarray = []
    f = []
    carray1 = []
    cblockarray1 = []
    carray2 = []
    cblockarray2 = []
    r1 = randint(1, bits-1)
    r = ZZ(r1)
    print 'k=', r
    for i in range(lb):
        mbin = btxt[i*chr : (i+1)*chr]
        m = ZZ(str(mbin), base=2)
        C = Mod(g, p)^r
        CC = ZZ(C)
        cblockarray1 = CC.binary()
        padclength1 = ZZ(bits - ZZ(len(cblockarray1)))
        for k in range(padclength1):
            carray1.append('0')
        carray1.append(cblockarray1)
        D = Mod(m*X^r, p)
        DD = ZZ(D)
        cblockarray2 = DD.binary()
        padclength2 = ZZ(bits - ZZ(len(cblockarray2)))
        for k in range(padclength2):
            carray2.append('0')
        carray2.append(cblockarray2)
    dbin = ''.join(carray2)
    cbin = ''.join(carray1)
    print 'final cipher text in binary c=', cbin
    print 'd=', dbin



• next_prime(a): This function outputs the next prime after a. 

• gcd(a,b): This function outputs the gcd of a and b. 

• mod(a, n): This function outputs an element of the ring Z_n, i.e., mod(a, n) returns
the equivalence class (a) in Z_n.

• e^(-1): If e is an element of Z_n^*, then e^(-1) represents the multiplicative
inverse of e in Z_n^*.

[Screenshot of the SAGE worksheet: the ElGamal encryption code of Table 12.13,
here with the function named ELGAMAL_test and the default values g = 3,
p = 229528577508036423867334854552268904441, and
X = 56361825206093085535190352362098016333.]

Fig. 12.10 ElGamal encryption algorithm in SAGE



• ZZ(d): This function converts the element d of the ring Z_n into the
corresponding element of the ring of integers Z.

• #: “#” is used to comment a line.

The documented SAGE code for the key generation algorithm is given in
Table 12.9. Screenshots of the actual SAGE code and of its output for 512 bits are
shown in Figs. 12.4 and 12.5, respectively.
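
The key generation steps described above can also be sketched outside SAGE. The following is only a minimal plain-Python illustration, not the code of Table 12.9: the helper names is_probable_prime, next_prime, and rsa_key_gen are assumptions introduced here, primality is tested with the Miller-Rabin test, and the inverse is computed with Python's built-in pow (Python 3.8 or later).

```python
import random
from math import gcd

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test (illustrative helper)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2^s with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False       # a is a witness that n is composite
    return True

def next_prime(n):
    """Smallest (probable) prime strictly greater than n."""
    n += 1
    while not is_probable_prime(n):
        n += 1
    return n

def rsa_key_gen(bits):
    """Return the public key (N, e) and the private exponent d."""
    p = next_prime(random.randint(2**(bits - 1), 2**bits - 1))
    q = p
    while q == p:              # make sure the two primes are distinct
        q = next_prime(random.randint(2**(bits - 1), 2**bits - 1))
    N, phi = p * q, (p - 1) * (q - 1)
    e = random.randrange(3, phi)
    while gcd(e, phi) != 1:    # e must be invertible modulo phi(N)
        e = random.randrange(3, phi)
    d = pow(e, -1, phi)        # multiplicative inverse of e modulo phi(N)
    return (N, e), d
```

For instance, after `(N, e), d = rsa_key_gen(32)`, any integer m with 0 <= m < N satisfies `pow(pow(m, e, N), d, N) == m`.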

Now we are going to explain the encryption algorithm. The idea of the encryption
algorithm is as follows: Suppose Alice is going to encrypt a plain text message.
Alice will use the function “RSA_encryption()” for encryption. This function takes
as input the plain text stored in the array “txt”, the security parameter (i.e., the
number of bits required to represent p or q in binary) stored in the variable “bits”,
and the public key stored in the variables N and e. The program first converts the plain
text (ASCII characters) into binary and stores it in the array “btxt”. Suppose the

[Screenshot of the SAGE output for the inputs txt = "Dr. Avishek Adhikari",
bits = 128, g = 3, p = 229528577508036423867334854552268904441, and
X = 56361825206093085535190352362098016333. The program prints k= 70, followed
by the two binary cipher text strings c and d; their digits are not legible in
this copy.]

Fig. 12.11 Output of the ElGamal encryption algorithm in SAGE



length (calculated using “len(btxt)”) of the binary string of the plain text is 3104
(note that it has to be a multiple of 8, as each ASCII character is represented by 8
bits). We subdivide the whole plain text string into blocks, so we need to
calculate the block size. For example, if the length of the binary representation of p
(or q) is 512 bits, then we choose the block size to be 512 * 2/2 = 512 bits. So the
number of blocks will be the ceiling of 3104/512, i.e., ceil(3104/512) = 7. We
convert each such binary block into the corresponding decimal representation,
say m. Note that the value of m is always less than N = pq, since m has at most
“bits” binary digits whereas N has about 2*bits binary digits. Then we calculate
c = m^e (mod N). Next we convert the decimal representation of c into binary. Note
that we need to pad a certain number of zeros to the left of this binary representation
to make it a binary string of length 2*bits, where the variable “bits” represents the
number of bits needed to represent p or q. Finally, we concatenate the contents of all
the sub-blocks to get the final binary representation of the cipher text, which the
program outputs.
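
The block-wise procedure just described can be sketched in plain Python as follows. This is a hedged illustration rather than the SAGE program itself: the function names are introduced here, and the toy key N = 60491 = 241 * 251 with e = 7 (so that bits = 8) is a hypothetical example that is far too small for real use.

```python
from math import ceil

def ascii_to_bin(text):
    """8-bit binary encoding of an ASCII string (stand-in for SAGE's ascii_to_bin)."""
    return ''.join(format(ord(ch), '08b') for ch in text)

def rsa_encrypt_blocks(text, bits, N, e):
    """Encrypt `text` block by block; each cipher block is 2*bits binary digits long."""
    btxt = ascii_to_bin(text)
    len_block = bits                              # block size in bits
    number_of_blocks = ceil(len(btxt) / len_block)
    pieces = []
    for i in range(number_of_blocks):
        mbin = btxt[i * len_block:(i + 1) * len_block]
        m = int(mbin, 2)                          # decimal value of the block, m < N
        c = pow(m, e, N)                          # c = m^e (mod N)
        pieces.append(format(c, 'b').zfill(2 * bits))  # left-pad with zeros
    return ''.join(pieces)

# Hypothetical toy parameters: p = 241 and q = 251 are 8-bit primes, so bits = 8.
cbin = rsa_encrypt_blocks("Test for RSA", bits=8, N=60491, e=7)
```

With these toy values every block holds a single character, so the 12 characters of the message give 12 cipher blocks of 16 binary digits each.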

In this program, we are going to use the following functions whose utilities are 
explained below. 

• from sage.crypto.util import ascii_to_bin: The function “ascii_to_bin” is
defined in sage.crypto.util. So to use “ascii_to_bin”, we need to import it from
“sage.crypto.util”.

• from sage.crypto.util import bin_to_ascii: The function “bin_to_ascii” is
defined in sage.crypto.util. So to use “bin_to_ascii”, we need to import it from
“sage.crypto.util”.

• RSA_encryption(txt,bits,N,e): This function takes as input the plain text, the
bit length, the public modulus N, and the encryption exponent e. This function
outputs the binary representation of the cipher text.

Table 12.14 ElGamal decryption algorithm using SAGE

# ElGamal decryption algorithm
from sage.crypto.util import ascii_to_bin, bin_to_ascii
@interact
def ElGamal_decryption(
        c = "0000000000000000000000000000000000000000000000"
            "00000000000011011000110101110010011010110111000101110111101010000000000"
            "00000000000000000000000000000000000000000000000000000000000000000000011"
            "01100011010111001001101011011100010111011110101000000000000000000000",
        d = "010110010110001000110100110011000000100110101000011101111010010110"
            "11001100001101101111100000101000111111010001000011010101000001001001010"
            "11101111000110010010011000001001010110111110110100101110101010000110100"
            "110111101111001111010100011010010110011011101100",
        bits = (16 .. 128),
        a = 19531696595743956248020054501166321389,
        p = 230430292394661874796713505744556451393):
    f = []
    lctext1 = len(c)
    lctext2 = len(d)
    nb = lctext1 / (bits)
    cblockarray1 = []
    cblockarray2 = []
    carray1 = []
    for i in range(nb):
        cblockarray1 = c[i*bits : (i+1)*bits]
        c1 = ZZ(str(cblockarray1), base=2)
        cblockarray2 = d[i*bits : (i+1)*bits]
        d1 = ZZ(str(cblockarray2), base=2)
        x = Mod(c1, p)^(-a)
        P = Mod(x*d1, p)
        M = ZZ(P)
        Mbin = M.binary()
        padlen = ZZ(8 - ZZ(len(Mbin)).mod(8))
        for k in range(padlen):
            f.append('0')
        f.append(Mbin)
        print 'f=', f
    fbin = ''.join(f)
    ftxt = bin_to_ascii(fbin)
    print 'Final recovered message = ', ftxt



• ascii_to_bin(): It converts ASCII to binary. To use it, we need to import
ascii_to_bin as written in the first line of the program. “ascii_to_bin” outputs a
binary string.

• bin_to_ascii(): It converts binary to ASCII. To use it, we need to import
bin_to_ascii as written in the first line of the program.

• len(btxt): “len” provides the length of the string stored in btxt.

• ceil(z): It gives the ceiling value of the rational number z.

• Mod(a,N)^e: It computes the value a^e (mod N).

• CC.binary(): It converts the decimal value stored in CC into binary.

• carray.append(cblockarray): The function “append()” appends the contents of
the array “cblockarray” to the end of the array “carray”, so that the result is
stored in “carray”.

• ''.join(carray): The function “join()” joins the contents of the array “carray” to
get a single string.
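
For readers without SAGE at hand, the string helpers listed above have straightforward plain-Python counterparts. The sketch below is an assumption-laden stand-in, not the SAGE implementation: the functions only mimic the behaviour described in the list, and pad_left is a hypothetical helper name introduced here.

```python
def ascii_to_bin(text):
    """Return the 8-bit binary encoding of an ASCII string."""
    return ''.join(format(ord(ch), '08b') for ch in text)

def bin_to_ascii(bits):
    """Inverse of ascii_to_bin; the length of `bits` must be a multiple of 8."""
    if len(bits) % 8 != 0:
        raise ValueError("length of the binary string must be a multiple of 8")
    return ''.join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

def pad_left(binary, width):
    """Left-pad a binary string with zeros up to the given width."""
    return binary.zfill(width)

# Collect padded blocks in a list and join them into one string,
# in the same way carray.append(...) and ''.join(carray) are used above.
carray = []
carray.append(pad_left(format(5, 'b'), 8))    # binary of 5, padded to 8 digits
carray.append(pad_left(format(65, 'b'), 8))   # binary of 65, padded to 8 digits
cbin = ''.join(carray)
```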

[Screenshot of the SAGE worksheet: the ElGamal decryption code of Table 12.14,
here with the default values a = 208587833051252351799933094476558972210 and
p = 229528577508036423867334854552268904441.]

Fig. 12.12 ElGamal decryption algorithm in SAGE



The SAGE code with documentation is given in Table 12.10.

The actual SAGE code for the RSA encryption program and its output are shown in
Figs. 12.6 and 12.7, respectively.

We are now going to explain the decryption algorithm run by Bob. The
RSA_decryption() function takes as input the cipher text in binary stored in the
array “ctext”, the security parameter stored in “bits”, the decryption exponent
stored in “d”, and the public modulus stored in “N”. This function outputs the
original plain text. The program works as follows: First the binary cipher text is
divided into blocks, each of size 2*bits. Then each binary block is converted into a
decimal number, say c. For each such c, we compute c^d (mod N) to get M.
Then we convert M into a binary string whose length is a multiple of 8 by suitably
padding zeros to the left. We append the binary strings obtained from the blocks one after

[Screenshot of the SAGE output of the ElGamal decryption program for bits = 128,
a = 208587833051252351799933094476558972210, and
p = 229528577508036423867334854552268904441. The array f is printed after each
of the two blocks (its binary digits are not legible in this copy), followed by
"Final recovered message = Dr. Avishek Adhikari".]

Fig. 12.13 Output of the ElGamal decryption algorithm in SAGE



Table 12.15 Rabin key generation algorithm

#Key Generation algorithm for Rabin Cryptosystem
@interact
def Rabin_key_gen(bits = (16 .. 128)):
    p = 1
    q = 1
    while (p == q):
        while p%4 != 3:
            p = next_prime(randint(2^(bits-1), 2^(bits)))
        while q%4 != 3:
            q = next_prime(randint(2^(bits-1), 2^(bits)))
    print '\n'
    print 'The private keys are: '
    print 'p= ', p
    print 'q= ', q
    print '\n'
    N = p*q
    print 'The public key is: '
    print 'N= ', N


another. Finally, we convert this binary string into ASCII characters to get back
the original plain text. The SAGE code with documentation for RSA decryption is
presented in Table 12.11.
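
The decryption steps can likewise be sketched in plain Python. As before, this is an illustration under stated assumptions and not the code of Table 12.11: the function names are introduced here, and the padding is computed as (-len) mod 8, so that no zeros are added when the length is already a multiple of 8.

```python
def bin_to_ascii(bits):
    """Convert a binary string whose length is a multiple of 8 into ASCII text."""
    return ''.join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

def rsa_decrypt_blocks(ctext, bits, N, d):
    """Decrypt a binary cipher text made of blocks of 2*bits binary digits."""
    width = 2 * bits
    pieces = []
    for i in range(len(ctext) // width):
        c = int(ctext[i * width:(i + 1) * width], 2)   # block as an integer
        m = pow(c, d, N)                               # M = c^d (mod N)
        mbin = format(m, 'b')
        mbin = '0' * (-len(mbin) % 8) + mbin           # pad to a multiple of 8
        pieces.append(mbin)
    return bin_to_ascii(''.join(pieces))
```

The left padding restores the leading zeros that are lost when a plain text block is converted into an integer; this works because every ASCII character starts with a 0 bit, so a block never loses more than seven leading zeros unless it begins with a null character.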

A screenshot of the actual SAGE code for the RSA decryption algorithm and its
output is shown in Fig. 12.8.

[Screenshot of the SAGE worksheet: the Rabin key generation code of Table 12.15,
run with bits = 16. The output is reproduced below.]

The private keys are:
p= 58679
q= 38903
The public key is:
N= 2282789137

Fig. 12.14 Rabin key generation algorithm with output in SAGE

12.13.2 SAGE Implementation of ElGamal Public-Key
Cryptosystem

The ElGamal public-key cryptosystem has already been explained in Sect. 12.7. Its
implementation is similar to that of RSA. Here also we need to implement
three algorithms, namely, the key generation algorithm (run by Bob), the
encryption algorithm (run by Alice), and the decryption algorithm (run by Bob).
For key generation, the function “ELGAMAL_key_gen()” takes as input the
security parameter stored in the variable “bits” and outputs the public and the
private values. In this algorithm, we use the function “primitive_root(n)”, which
returns a generator of the multiplicative group of integers modulo n, if one exists.
The key generation algorithm for the ElGamal cryptosystem is given in Table 12.12.
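
To make the three ElGamal algorithms concrete, here is a small plain-Python sketch on integers (without the binary block handling). It is only an illustration: the function names are introduced here, and instead of a general primitive_root routine it assumes a safe prime p = 2q + 1, for which g generates the multiplicative group modulo p exactly when g^2 and g^q both differ from 1 modulo p. The toy prime p = 2579 (with q = 1289) is a hypothetical choice, far too small for real use.

```python
import random

def is_primitive_root(g, p, q):
    """For a safe prime p = 2q + 1, test whether g generates Z_p^*."""
    return pow(g, 2, p) != 1 and pow(g, q, p) != 1

def elgamal_key_gen(p, q):
    """Return the public key (p, g, X) and the private key a."""
    g = next(g for g in range(2, p) if is_primitive_root(g, p, q))
    a = random.randint(1, p - 2)
    X = pow(g, a, p)                       # X = g^a (mod p)
    return (p, g, X), a

def elgamal_encrypt(m, public_key):
    """Encrypt an integer 0 < m < p; returns the pair (c, d)."""
    p, g, X = public_key
    r = random.randint(1, p - 2)           # fresh random exponent
    return pow(g, r, p), (m * pow(X, r, p)) % p

def elgamal_decrypt(c, d, a, p):
    """Recover m = d * c^(-a) (mod p)."""
    x = pow(c, p - 1 - a, p)               # equals c^(-a) by Fermat's little theorem
    return (x * d) % p

public_key, a = elgamal_key_gen(2579, 1289)
c, d = elgamal_encrypt(65, public_key)
recovered = elgamal_decrypt(c, d, a, 2579)   # gives back 65
```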

A screenshot of the actual SAGE code for the ElGamal key generation algorithm
along with its output is shown in Fig. 12.9.

The SAGE code for ElGamal encryption is presented in Table 12.13.

Screenshots of the actual SAGE code for the ElGamal encryption algorithm and of
its output are shown in Figs. 12.10 and 12.11, respectively.

The SAGE code for ElGamal decryption is presented in Table 12.14.

Screenshots of the actual SAGE code for the ElGamal decryption algorithm and of
its output are shown in Figs. 12.12 and 12.13, respectively.

Table 12.16 Rabin encryption algorithm using SAGE

# Rabin encryption algorithm
from sage.crypto.util import least_significant_bits
from sage.crypto.util import ascii_to_bin
@interact
def Rabin_encryption(txt = "A", N = 2245559221, bits = 16):
    btxt = ascii_to_bin(txt)  # binary representation of the input in a multiple of 8
    C1_array = []    # the array will help to store the cipher text C1 in binary
    C2_array = []    # the array will help to store the cipher text C2 in binary
    C1_blockarray = []
    lbtxt = len(btxt)         # length of the input binary string
    R = IntegerModRing(N)     # we shall work in the ring Z_N, that is, the ring of
                              # integers modulo N
    for i in range(lbtxt):
        print '\n'
        print 'ROUND = ', i+1
        print '\n'
        m = btxt[i]           # since btxt contains a binary string, we need to convert
                              # it to an integer in the next step
        m = ZZ(str(m), base=2)       # integer representation of m from string
        y = R.random_element()
        while (GCD(y, N) != 1):
            y = R.random_element()   # y is a random element from Z_N^*
        r = Mod(y, N)^2    # random element from QR_N; we need its integer value
        x = ZZ(r)          # integer value of r, so x is a random element from QR_N
        c1 = ZZ(Mod(x, N)^2)  # 1st component of the cipher text in integer form;
                              # we need to put it in binary, so ZZ is required
        lsb = least_significant_bits(x, 1)  # lsb is an array containing the least
                              # significant bit of x in its zeroth location
        c2 = (lsb[0] + m) % 2        # 2nd component of the cipher text
        C1_blockarray = c1.binary()  # c1 has to be represented in binary with the
                              # length of N, so we first make it binary and pad suitably
        padclength = ZZ(2*bits - ZZ(len(C1_blockarray)))
        for k in range(padclength):
            C1_array.append('0')
        C1_array.append(C1_blockarray)  # the step in which the consecutive
                              # binary representations of the C1's are appended
        C2_blockarray = c2.binary()  # though c2 is a single bit, to append it we need
                              # to make it a string
        C2_array.append(C2_blockarray)  # the step in which the consecutive binary
                              # string representations of the C2's are appended
    C1_bin = ''.join(C1_array)
    C2_bin = ''.join(C2_array)
    print 'The final Cipher for C1 is: ', C1_bin
    print 'The final Cipher for C2 is: ', C2_bin



12.13.3 SAGE Implementation of Rabin Public-Key Cryptosystem


We have already explained the Rabin public-key cryptosystem in Sect. 12.8. In this 
section, we are going to implement this cryptosystem in SAGE. Like RSA and El- 
Gamal, Rabin cryptosystem also has three major parts, namely key generation, en- 

Table 12.17 Rabin decryption algorithm using SAGE

# Rabin decryption algorithm
from sage.crypto.util import least_significant_bits, ascii_to_bin, bin_to_ascii
@interact
def Rabin_decryption(c1 = "01101111000111110000011101100100011001101011010"
                          "1100110010001111001111000111000111100001101111001100110",
                     c2 = "00010011", p = 34351, q = 65371, bits = 16):
    N = p*q
    R = IntegerModRing(N)
    f = []
    len_c1 = len(c1)
    blocks = len_c1/(bits*2)  # no of blocks, each of the length of N, i.e., 2*bits
    for i in range(blocks):
        cblockarray = c1[i*2*bits : (i+1)*2*bits]
        c = ZZ(str(cblockarray), base=2)  # decimal representation of C1 in ith block
        cp = c%p
        cq = c%q
        ap = Mod(cp, p)^((p+1)/4)  # to calculate the 4 values of the square roots
        aq = Mod(cq, q)^((q+1)/4)  # to calculate the 4 values of the square roots
        ap1 = ZZ(ap)  # to calculate the Legendre symbol, an integer value is required;
                      # that is why we convert the ring element into an integer
        aq1 = ZZ(aq)
        ap2 = ZZ(-ap)
        aq2 = ZZ(-aq)
        ls_ap1 = legendre_symbol(ap1, p)  # Legendre symbol for ap1 modulo p
        ls_ap2 = legendre_symbol(ap2, p)
        ls_aq1 = legendre_symbol(aq1, q)
        ls_aq2 = legendre_symbol(aq2, q)
        if ((ls_ap1 == 1) & (ls_aq1 == 1)):  # if both the values are 1, then we can
                      # find the required element x by using the Chinese Remainder
                      # Theorem (CRT)
            x = crt(ap1, aq1, p, q)
        if ((ls_ap1 == 1) & (ls_aq2 == 1)):
            x = crt(ap1, aq2, p, q)
        if ((ls_ap2 == 1) & (ls_aq1 == 1)):
            x = crt(ap2, aq1, p, q)
        if ((ls_ap2 == 1) & (ls_aq2 == 1)):
            x = crt(ap2, aq2, p, q)
        x1 = ZZ(x)  # as x is in the ring of integers modulo pq, to use the function
                    # least_significant_bits(), we need to convert it into an integer
        lsb_x1 = least_significant_bits(x1, 1)  # least significant bit of the element
        c_2_blockarray = c2[i : (i+1)]  # as c2 is a string of binary digits, we need
                    # to convert it from string to integer in the next line
        cc2 = ZZ(str(c_2_blockarray), base=2)
        bin_plain_text = (lsb_x1[0] + cc2) % 2
        string_bin_plain_text = bin_plain_text.binary()
        f.append(string_bin_plain_text)  # f is the final array, in string format,
                    # that contains the binary plain text
    # out of the main for loop
    fbin = ''.join(f)  # it joins the elements of f to use bin_to_ascii
    ftxt = bin_to_ascii(fbin)
    print 'Final recovered message = ', ftxt






[Screenshot of the SAGE worksheet: the Rabin encryption code of Table 12.16,
here with the default values txt = "Bob", N = 2282789137, and bits = 16.]

Fig. 12.15 Rabin encryption algorithm in SAGE


cryption, and decryption. In the key generation step, for a given security parameter
“bits”, we need two random primes p and q of length “bits” such that
p ≡ q ≡ 3 (mod 4).
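
The requirement on p and q can be illustrated by the following plain-Python sketch. It is an assumption-based variant and not the code of Table 12.15: the function names are introduced here, primality is checked by simple trial division (adequate only for small toy sizes such as 16 bits), and the two lowest bits of each random candidate are forced to 1 so that the candidate is congruent to 3 modulo 4.

```python
import random

def is_prime(n):
    """Trial-division primality test; fine for small toy sizes only."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def random_blum_prime(bits):
    """Random prime p with 2^(bits-1) <= p < 2^bits and p congruent to 3 mod 4."""
    while True:
        p = random.randrange(2**(bits - 1), 2**bits) | 3   # set the two lowest bits
        if is_prime(p):
            return p

def rabin_key_gen(bits):
    """Return the private key (p, q) and the public key N = p*q."""
    p = random_blum_prime(bits)
    q = random_blum_prime(bits)
    while q == p:                     # draw q again until the primes are distinct
        q = random_blum_prime(bits)
    return (p, q), p * q

(p, q), N = rabin_key_gen(16)
```

Drawing q again inside the loop guarantees that the two primes end up distinct, however unlikely a collision is for large bit lengths.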

The SAGE code for Rabin key generation is presented in Table 12.15.

A screenshot of the actual SAGE code for the Rabin key generation algorithm along
with its output is shown in Fig. 12.14.

For encryption, we have to import the functions “least_significant_bits” and
“ascii_to_bin” by writing “from sage.crypto.util import least_significant_bits” and
“from sage.crypto.util import ascii_to_bin” at the very beginning of the
encryption program. Here we shall deal with the ring Z_N, that is, the ring of integers
modulo N. After the command “R=IntegerModRing(N)”, R becomes the
ring of integers modulo N. We further require a random element from R. The
command “R.random_element()” returns a random element from the ring of integers
modulo N. Further, the command “least_significant_bits(x, 1)” returns the least
significant bit of x. As mentioned earlier, to use this function, we need to import it from
“sage.crypto.util”.

[Screenshot of the SAGE output for the inputs txt = "Bob", N = 2282789137, and
bits = 16. The binary string printed as the final cipher for C1 is not legible in
this copy; the second string is reproduced below.]

The final Cipher for C2 is : 101110010110101110000010

Fig. 12.16 Output of the Rabin encryption algorithm in SAGE


For the encryption algorithm we use the function “Rabin_encryption()”. As 
input, this function takes the plain text stored in “txt”, the value N and 
the security parameter “bits”. It returns a pair of binary strings of cipher 
text, stored in C1_bin and C2_bin. 

For the decryption algorithm, we use the function “Rabin_decryption()”, which takes 
as input the two binary strings of cipher text stored in the variables c1 and c2, the 
values of p and q, and the security parameter “bits”. It outputs the original plain text. 
In this decryption algorithm, we have used the previously explained functions 
together with two new ones, namely “legendre_symbol(a,p)” and 
“crt(a,b,p,q)”. The function “legendre_symbol(a,p)” computes the Legendre symbol 
$\left(\frac{a}{p}\right)$, where p is a prime. On the other hand, the function 
“crt(a,b,p,q)” returns a solution to the congruences $x \equiv a \pmod{p}$ and 
$x \equiv b \pmod{q}$ using the Chinese Remainder Theorem. 
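
The roles of the Legendre symbol and the Chinese Remainder Theorem in decryption can be illustrated outside SAGE. The following plain-Python sketch decrypts a single bit along the lines of Fig. 12.17. The helper names and the function encrypt_bit are assumptions made for illustration: the encryption step is inferred from the decryption code and is not taken from Table 12.16. Both primes are assumed to be congruent to 3 modulo 4.

```python
import math
import random


def legendre_symbol(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def crt(a, b, p, q):
    """Return x in [0, p*q) with x = a (mod p) and x = b (mod q), p and q coprime."""
    a %= p
    t = ((b - a) * pow(p, -1, q)) % q  # modular inverse needs Python 3.8+
    return a + p * t


def encrypt_bit(m, N, rng):
    """Hypothetical encryption of one bit m (inferred, not the book's Table 12.16)."""
    while True:
        r = rng.randrange(2, N)
        if math.gcd(r, N) == 1:
            break
    x = pow(r, 2, N)                  # a random quadratic residue modulo N
    return pow(x, 2, N), m ^ (x & 1)  # (c1, c2): square of x, and m masked by lsb(x)


def decrypt_bit(c1, c2, p, q):
    """Recover the bit from (c1, c2) for primes p, q congruent to 3 modulo 4."""
    ap = pow(c1, (p + 1) // 4, p)  # a square root of c1 modulo p
    aq = pow(c1, (q + 1) // 4, q)  # a square root of c1 modulo q
    for rp in (ap, p - ap):
        for rq in (aq, q - aq):
            # keep the root that is itself a quadratic residue mod p and mod q
            if legendre_symbol(rp, p) == 1 and legendre_symbol(rq, q) == 1:
                x = crt(rp, rq, p, q)
                return (x & 1) ^ c2
    raise ValueError("c1 is not the square of a quadratic residue")
```

With p = 58679 and q = 38903 as in Fig. 12.17, encrypting a bit with encrypt_bit and passing the result to decrypt_bit should return the original bit, because exactly one of the four square roots of c1 is a quadratic residue modulo both primes.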

The SAGE code for Rabin encryption scheme with documentation is presented 
in Table 12.16. 

Screenshots of the actual SAGE code for the Rabin encryption algorithm and of 
its output are shown in Fig. 12.15 and Fig. 12.16, respectively. 

The SAGE code for Rabin decryption scheme with documentation is presented 
in Table 12.17. 

Screenshots of the actual SAGE code for the Rabin decryption algorithm and of 
its output are shown in Fig. 12.17 and Fig. 12.18, respectively. 


12.14 Exercises 

1. Find the primes p and q if n = pq = 4386607 and $\varphi(n) = 4382136$. 

    from sage.crypto.util import bin_to_ascii, least_significant_bits

    @interact
    def Rabin_decryption(
            c1="0011011001100011100101000000011100000010101111101111111111010100000010110"
               "110111001011111010110000010111100000000101001100111010101110110111100110001100"
               "101011000010001001110101000011110110101000000011111001111101111011000001100101"
               "000101101001011110011110001001110110100110101111010000111101000000111100001001"
               "010111010000001111111100111000001000000010000001101110001111011110001111100100"
               "111011001000000010011110111001001100111101001110110100000101010000100111101001"
               "100011111010100100111111100111100010111100011000001010100011100001011001100101"
               "101011000001001000101100000001100010100111111101001100011111000000000001011011"
               "010111010000011110011000010010001001110100101100110000111100111011100110100101"
               "00101010011000001011011111010011101100101110101100000110001110110111111",
            c2="101110010110101110000010", p=58679, q=38903, bits=16):
        N = p*q
        R = IntegerModRing(N)
        f = []
        len_c1 = len(c1)
        blocks = len_c1 // (bits*2)        # each cipher block has 2*bits bits
        for i in range(blocks):
            cblockarray = c1[i*2*bits : (i+1)*2*bits]
            c = ZZ(str(cblockarray), base=2)
            cp = c % p
            cq = c % q
            # square roots of c modulo p and modulo q (p, q congruent to 3 mod 4)
            ap = Mod(cp, p)^((p+1)//4)
            aq = Mod(cq, q)^((q+1)//4)
            ap1 = ZZ(ap)
            aq1 = ZZ(aq)
            ap2 = ZZ(-ap)
            aq2 = ZZ(-aq)
            ls_ap1 = legendre_symbol(ap1, p)
            ls_ap2 = legendre_symbol(ap2, p)
            ls_aq1 = legendre_symbol(aq1, q)
            ls_aq2 = legendre_symbol(aq2, q)
            # select the root that is a quadratic residue modulo both p and q
            if (ls_ap1 == 1) and (ls_aq1 == 1):
                x = crt(ap1, aq1, p, q)
            if (ls_ap1 == 1) and (ls_aq2 == 1):
                x = crt(ap1, aq2, p, q)
            if (ls_ap2 == 1) and (ls_aq1 == 1):
                x = crt(ap2, aq1, p, q)
            if (ls_ap2 == 1) and (ls_aq2 == 1):
                x = crt(ap2, aq2, p, q)
            x1 = ZZ(x)
            lsb_x1 = least_significant_bits(x1, 1)
            c_2_blockarray = c2[i:(i+1)]
            cc2 = ZZ(str(c_2_blockarray), base=2)
            # unmask the plain-text bit with the least significant bit of x
            bin_plain_text = (lsb_x1[0] + cc2) % 2
            string_bin_plain_text = bin_plain_text.binary()
            f.append(string_bin_plain_text)
        fbin = ''.join(f)
        ftxt = bin_to_ascii(fbin)
        print('Final recovered message = ' + ftxt)


Fig. 12.17 Rabin decryption algorithm in SAGE 


2. Decrypt each of the following Caesar encryptions by trying the various possible 
shifts until you obtain a meaningful text. 

(a) LWKLQNWKDWLVKDOOQHYHUVHHDELOOERDUGORYHOBDV 
DWUHH 

(b) UXENRBWXCUXENFQRLQJUCNABFQNWRCJUCNAJCRXWORW 
MB 




c1     0011011001100011100101000000011100000010101111101111111 
c2     101110010110101110000010 
p      58679 
q      38903 
bits   16 

Final recovered message = Bob 


Fig. 12.18 Output of the Rabin decryption algorithm in SAGE 


(c) BGUTBMBGZTFHNLXMKTIPBMAVAXXLXTEPTRLEXTOXKHHFY 
HKMAXFHNLX 

3. Use the key $K = (5, 11) \in \mathcal{K}$ to encrypt the message 
“ILOVECRYPTOGRAPHY” using the Affine cipher, assuming that 
$\mathcal{P} = \mathbb{Z}_{26}$. Working from the cipher text, describe an 
attack, assuming that the key K is unknown to the adversary. Calculate the 
maximum number of trials the adversary has to make before breaking the scheme. 

4. Give an example of a cryptographic scheme which is polyalphabetic. 

5. With the key AVI encrypt the plain text “ABSTRACTALGEBRA” using the 
Vigenere Cipher. 

6. With the key k = [” ], encrypt the plain text “CRYPTOGRAPHYISFUN” us- 
ing Hill 2-cipher. 

7. What are the advantages of using a public-key cryptosystem over a private-key 
cryptosystem? 

8. Let g be a primitive root for $\mathbb{Z}_p$, for some prime p. Suppose that 
x = a and x = b are both integer solutions to the congruence 
$g^x \equiv y \pmod{p}$. 

(a) Prove that $a \equiv b \pmod{p-1}$. 

(b) Prove that $\log_g(h_1 \cdot h_2) = \log_g(h_1) + \log_g(h_2)$ for all 
$h_1, h_2 \in \mathbb{Z}_p^*$. 

9. Mount a man-in-the-middle attack for the RSA cryptosystem. 

10. Mount a bidding attack on the RSA cryptosystem. 

11. Construct basis matrices for a black and white visual cryptographic scheme 
with $\Gamma_0 = \{\{1, 2, 3\}, \{2, 3, 4\}\}$. What could be the pixel expansion 
and relative contrast of the scheme? 

12. Using SAGE, implement the Diffie-Hellman key agreement protocol for a 
512-bit prime. 

13. Using SAGE, implement the bidding attack on RSA with 512-bit primes. 

14. Using SAGE, implement the RSA signature scheme with 512-bit primes. 

15. Using SAGE, implement Shamir’s secret sharing scheme in the field 
$\mathbb{Z}_p$ with a 512-bit prime p. 


12.15 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2007; Menezes et al. 1996; 
Stinson 2002; Katz and Lindell 2007) for further details. 


Appendix A 

Some Aspects of Semirings 


Semirings, considered as a common generalization of associative rings and 
distributive lattices, provide important tools in different branches of computer 
science. Hence structural results on semirings are of basic interest. Semirings 
appear in different mathematical areas: as the ideals of a ring, as positive 
cones of partially ordered rings and fields, in vector bundles, in the context of 
topological considerations, in the foundations of arithmetic, etc. In this 
appendix some algebraic concepts are introduced which generalize the 
corresponding concepts for the semiring $\mathbb{N}$ of non-negative integers, 
and their algebraic theory is discussed. 


A.1 Introductory Concepts 

H.S. Vandiver gave the first formal definition of a semiring and developed the 
theory of a special class of semirings in 1934. A semiring S is defined as an algebra 
$(S, +, \cdot)$ such that $(S, +)$ and $(S, \cdot)$ are semigroups connected by 
$a(b + c) = ab + ac$ and $(b + c)a = ba + ca$ for all $a, b, c \in S$. The set 
$\mathbb{N}$ of all non-negative integers with usual addition and multiplication 
of integers is an example of a semiring, called the semiring of non-negative 
integers. A semiring S may have an additive zero $o$ defined by 
$o + a = a + o = a$ for all $a \in S$, or a multiplicative zero $0$ defined by 
$0a = a0 = 0$ for all $a \in S$. S may contain both $o$ and $0$, but they need not 
coincide. Consider the semiring $(\mathbb{N}, +, \cdot)$, where $\mathbb{N}$ is 
the set of all non-negative integers, 

$$a + b = \begin{cases} \operatorname{lcm}(a, b), & \text{when } a \neq 0,\ b \neq 0, \\ 0, & \text{otherwise,} \end{cases}$$

and 

$$a \cdot b = \text{usual product of } a \text{ and } b.$$

Then the integer 1 is the additive zero and the integer 0 is the multiplicative 
zero of $(\mathbb{N}, +, \cdot)$. 
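
The claims about this example can be checked mechanically on a finite range of integers. The following Python sketch is only an illustration (the function names are chosen here, not taken from the text): it verifies that 1 behaves as the additive zero, that 0 behaves as the multiplicative zero, and that the two distributive laws and the associativity of the new addition hold for all elements up to a chosen limit.

```python
from itertools import product
from math import gcd


def add(a, b):
    """Semiring 'addition': lcm(a, b) when both are nonzero, and 0 otherwise."""
    return a * b // gcd(a, b) if a != 0 and b != 0 else 0


def mul(a, b):
    """Semiring multiplication: the usual product."""
    return a * b


def check_example(limit):
    """Check the stated properties for all elements 0, 1, ..., limit."""
    elems = range(limit + 1)
    triples = list(product(elems, repeat=3))
    additive_zero = all(add(1, a) == a == add(a, 1) for a in elems)
    multiplicative_zero = all(mul(0, a) == 0 == mul(a, 0) for a in elems)
    distributive = all(
        mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
        and mul(add(b, c), a) == add(mul(b, a), mul(c, a))
        for a, b, c in triples
    )
    associative = all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples)
    return additive_zero and multiplicative_zero and distributive and associative
```

A finite check of this kind is of course no proof, but it makes the unusual feature of the example visible: add(0, 5) equals 0, so the integer 0 is not neutral for this addition, while add(1, 5) equals 5.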


M.R. Adhikari, A. Adhikari, Basic Modern Algebra with Applications, 
DOI 10.1007/978-81-322-1599-8, © Springer India 2014 

In general, the multiplicative zero $0$ and the additive zero $o$ need not 
coincide in a semiring. H.J. Weinert proved that the two coincide if $(S, +)$ 
is a cancellative semigroup. Clearly, S has an absorbing zero (or a zero element) iff 
it has elements $0$ and $o$ which coincide. Thus an absorbing zero of a semiring S 
is an element $0$ such that $0a = a0 = 0$ and $0 + a = a + 0 = a$ for all $a \in S$. 

A semiring S may have an identity 1 defined by $1a = a1 = a$ for all $a \in S$. 

A semiring S is said to be additively commutative iff $a + b = b + a$ for all 
$a, b \in S$. An additively commutative semiring with an absorbing zero 0 is called a 
hemiring. An additively cancellative hemiring is called a half ring. A multiplicatively 
cancellative commutative semiring with identity 1 is called a semidomain. 

We mainly consider semirings S for which $(S, +)$ is commutative. If $(S, \cdot)$ is 
also commutative, S is called a commutative semiring. Moreover, to avoid trivial 
exceptions, each semiring S is assumed to have at least two elements. 

A subset $A \neq \emptyset$ of a semiring S is called an ideal (left ideal, right 
ideal) of S iff $a + b \in A$, $sa \in A$ and $as \in A$ ($sa \in A$; $as \in A$, 
respectively) hold for all $a, b \in A$ and all $s \in S$. An ideal A of S is called 
proper iff $A \subset S$ holds, where $\subset$ denotes proper inclusion, and a 
proper ideal A is called maximal iff there is no ideal B of S satisfying 
$A \subset B \subset S$. An ideal A of S is called trivial iff $A = S$ or 
$A = \{0\}$ holds (the latter is clearly an ideal if S has an absorbing zero 0). 

Many results in rings related to ideals have no analogues in semirings. As noted 
by Henriksen, it is not the case that every ideal A of a semiring S is the kernel of a 
homomorphism. To get around this difficulty, he defined in 1958 a more restricted 
class of ideals in a semiring, which he called k-ideals. A k-ideal A of a semiring S 
is an ideal of S such that whenever $x + a \in A$, where $a \in A$ and $x \in S$, 
then $x \in A$. Iizuka defined a still more restricted class of ideals in semirings, 
which he called h-ideals. An h-ideal A of a semiring S is an ideal of S such that if 
$x + a + u = b + u$, where $x, u \in S$ and $a, b \in A$, then $x \in A$. It is clear 
that every h-ideal is a k-ideal, but examples disprove the converse. Using only 
commutativity of addition, the following concepts and statements, essentially due 
to Bourne, Zassenhaus and Henriksen, are well known. For each ideal A of a 
semiring S, the k-closure 
$\bar{A} = \{a \in S : a + a_1 = a_2 \text{ for some } a_i \in A\}$ is a k-ideal of S 
satisfying $A \subseteq \bar{A}$ and $\bar{\bar{A}} = \bar{A}$. Clearly, an ideal A of 
S is a k-ideal of S iff $A = \bar{A}$ holds. A proper k-ideal A of S is called a 
maximal k-ideal of S iff there is no k-ideal B of S satisfying 
$A \subset B \subset S$. An ideal A of a semiring S is called completely prime or 
prime iff $ab \in A$ implies $a \in A$ or $b \in A$, for all $a, b \in S$. 

Let S and T be hemirings. A map $f : S \to T$ is said to be a hemiring 
homomorphism iff $f(s_1 + s_2) = f(s_1) + f(s_2)$, $f(s_1 s_2) = f(s_1) f(s_2)$ 
and $f(0) = 0$. 

A hemiring homomorphism $f : S \to T$ is said to be an A-homomorphism iff, 
whenever $f(x) = f(y)$ for some $x, y \in S$, there exist $r_i \in \ker f$ such that 
$x + r_1 = y + r_2$ holds. 

An equivalence relation $\rho$ on a semiring S is called a congruence relation iff 
$(a, b) \in \rho$ implies $(a + c, b + c) \in \rho$, $(ca, cb) \in \rho$ and 
$(ac, bc) \in \rho$ for all $c \in S$. A congruence relation $\rho$ on a semiring S is 
said to be additively cancellative (AC) iff $(a + c, b + d) \in \rho$ and 
$(c, d) \in \rho$ imply $(a, b) \in \rho$. Each ideal A of S defines a congruence 
relation $\rho_A$ on $(S, +, \cdot)$ given by 
$\rho_A = \{(x, y) \in S \times S : x + a_1 = y + a_2 \text{ for some } a_i \in A\}$; 
this is known as the Bourne congruence. The corresponding class semiring 
$S/\rho_A$, consisting of the classes $x\rho_A$, is also denoted by $S/A$. Each 
ideal A of S defines another type of congruence relation, known as the Iizuka 
congruence, defined by 
$\sigma_A = \{(x, y) \in S \times S : x + a + u = y + b + u \text{ for some } a, b \in A \text{ and } u \in S\}$. 
The corresponding class semiring $S/\sigma_A$ consists of the classes 
$x\sigma_A$, $x \in S$. 
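
As a concrete illustration of the Bourne congruence, take the semiring of non-negative integers and the ideal A = (n) of all multiples of n. The Python sketch below (function names and the search bound are choices made here for illustration) tests the defining condition x + a_1 = y + a_2 by a bounded search and groups the integers 0, ..., limit into classes; one expects the residue classes modulo n.

```python
def bourne_related(x, y, n, bound=50):
    """True if x + a1 == y + a2 for some multiples a1, a2 of n (searched up to bound*n)."""
    multiples = [n * i for i in range(bound + 1)]
    return any(x + a1 == y + a2 for a1 in multiples for a2 in multiples)


def bourne_classes(n, limit):
    """Partition {0, ..., limit} into Bourne classes for the ideal (n) of N."""
    classes = []
    for x in range(limit + 1):
        for cls in classes:
            if bourne_related(x, cls[0], n):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes
```

The search is exact as long as bound*n is at least limit, since two integers in the range are related precisely when their difference is a multiple of n.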

Bourne and Zassenhaus defined the zeroid $Z(S)$ of a semiring S as 
$Z(S) = \{z \in S : z + x = x \text{ for some } x \in S\}$; it is an h-ideal of S and 
is contained in every h-ideal of S. $Z(S)$ and S are called trivial h-ideals, and 
other h-ideals (if they exist) are called proper h-ideals of S. A proper h-ideal A 
of S is called a maximal h-ideal iff there is no h-ideal B of S satisfying 
$A \subset B \subset S$. The h-closure $\bar{A}^h$ of each ideal A of a semiring S, 
defined by 
$\bar{A}^h = \{x \in S : x + a_1 + u = a_2 + u \text{ for some } a_i \in A \text{ and } u \in S\}$, 
is the smallest h-ideal of S containing A. 

A semiring S is said to be additively regular iff for each $a \in S$ there exists an 
element $b \in S$ such that $a = a + b + a$. If, in addition, the element b is unique 
and satisfies $b = b + a + b$, then S is called an additively inverse semiring. 

A semiring S is said to be semisubtractive iff for each pair a, b in S at least one 
of the equations $a + x = b$ or $b + x = a$ is solvable in S. 

Let S be a semiring with absorbing zero 0. A left semimodule over S is a 
commutative additive semigroup M with a zero element 0 together with an operation 

$$S \times M \to M, \qquad (a, x) \mapsto ax,$$

called the scalar multiplication, such that for all $a, b \in S$ and $x, y \in M$, 

$$a(x + y) = ax + ay, \quad (a + b)x = ax + bx, \quad (ab)x = a(bx), \quad 0x = 0.$$

A right S-semimodule is defined in an analogous manner. Let H and T be 
hemirings. An (H, T) bi-semimodule M is both a left semimodule over H and a right 
semimodule over T such that $(bx)a = b(xa)$ for all $x \in M$, $b \in H$ and 
$a \in T$. 

Let R be an arbitrary hemiring. Let S be also a hemiring. R is said to be an 
S-semialgebra iff R is a bi-semimodule over S such that $(ax)b = a(xb)$ for all 
$a, b \in R$ and $x \in S$. 

An h-ideal A of the S-semialgebra R is said to be a left modular h-ideal iff there 
exists an element $e \in R$ such that 

(i) each $x \in R$ satisfies $ex + a + z = x + b + z$ for some $z \in R$ and some 
$a, b \in A$; 

(ii) each $s \in S$ satisfies $se + c + h = es + d + h$ for some $h \in R$ and some 
$c, d \in A$. 

In this case e is called a left unit modulo A. 

A pair $(e_1, e_2)$ of elements of a semiring S is called an identity pair iff 
$a + e_1 a = e_2 a$ and $a + a e_1 = a e_2$, for all $a \in S$. 

We define a right modular h-ideal in a similar manner. A congruence $\rho$ on a 
semiring S is called a ring congruence iff the quotient semiring $S/\rho$ is a ring. 
A semiring S with 0 and 1 is said to be h-Noetherian (k-Noetherian) iff it 
satisfies any one of the following three equivalent conditions: 

(i) S satisfies the ascending chain condition on h-ideals (k-ideals); 

(ii) The maximal condition for h-ideals (k-ideals) holds in S; 

(iii) Every h-ideal (k-ideal) in S is finitely generated. 


A.2 More Results on Semirings 

In this section we present more results from the algebraic theory of semirings 
which are closely related to the corresponding results of ring theory. 

Theorem A.2.1 Let S be a semiring such that $S = \langle a_1, a_2, \ldots, a_n \rangle$ 
is finitely generated. Then each proper k-ideal A of S is contained in a maximal 
k-ideal of S. 

Proof Let $k(A)$ be the set of all k-ideals B of S satisfying 
$A \subseteq B \subset S$, partially ordered by inclusion. Then 
$k(A) \neq \emptyset$, for A itself belongs to $k(A)$. Consider a chain 
$\{B_i, i \in I\}$ in $k(A)$. We claim that $B = \bigcup_{i \in I} B_i$ is a proper 
k-ideal of S. Let $a, b \in B$ and $s \in S$. Then there exist $i, j \in I$ such that 
$a \in B_i$ and $b \in B_j$. As $\{B_i, i \in I\}$ forms a chain, either 
$B_i \subseteq B_j$ or $B_j \subseteq B_i$. For definiteness, we suppose 
$B_i \subseteq B_j$, so that both $a, b \in B_j$. Since $B_j$ is a k-ideal of S, 
$a + b \in B_j \subseteq B$ and $as, sa \in B_j \subseteq B$ for all $s \in S$. 
Again, if $a + x \in B$, $a \in B$ and $x \in S$, then proceeding as above, 
$a + x, a \in B_j$ for some $B_j$ of the above chain. Since $B_j$ is a k-ideal, it 
follows that $x \in B_j \subseteq B$. As a result B is a k-ideal of S. Next we 
verify that B is a proper k-ideal of S. Suppose to the contrary that 
$B = S = \langle a_1, a_2, \ldots, a_n \rangle$. Then each $a_r$ would belong to 
some k-ideal $B_{i_r}$ of the chain $\{B_i\}$, where $r = 1, 2, \ldots, n$. There 
being only finitely many $B_{i_r}$, one contains all others; let us call it $B_t$, 
where $t \in I$. Thus $a_1, a_2, \ldots, a_n$ all lie in this $B_t$. Consequently, 
$B_t = S$, which is clearly impossible. As a result $B \in k(A)$. Thus by Zorn’s 
Lemma $k(A)$ has a maximal element, as we were to prove. □ 

Corollary Let S be a semiring with 1. Then each proper k-ideal of S is contained 
in a maximal k-ideal of S. 

Proof The proof is immediate from $S = \langle 1 \rangle$. □ 

We now consider conditions on a semiring S under which S has non-trivial k-ideals 
or maximal k-ideals, among others with the help of the congruence class semiring 
$S/A$ defined by an ideal A of S. Let $S'$ denote $S \setminus \{0\}$ if S has 0, 
and $S' = S$ otherwise. 

Definition A.2.1 A semiring S is said to satisfy the condition (C') iff for all 
$a \in S'$ and all $s \in S$, there are $s_1, s_2 \in S$ such that 
$s + s_1 a = s_2 a$ holds. 

Clearly, if S has an identity 1, then (C') is equivalent to the condition (C), which 
states that $1 + s_1 a = s_2 a$ holds for each $a \in S'$ and suitable 
$s_1, s_2 \in S$. 

Example A.2.1 Let $\mathbb{Q}^+$ be the set of all non-negative rational numbers. 
Then $(\mathbb{Q}^+, +, \cdot)$ with usual operations is a semiring with 1 as 
identity satisfying condition (C). The same is true, more generally, for each 
positive cone P of a totally ordered skew field (Fuchs 1963). 

Example A.2.2 Let $\mathbb{N}$ be the set of all non-negative integers. Define 
$a + b = \max\{a, b\}$ and denote by $ab$ the usual multiplication. Then 
$(\mathbb{N}, +, \cdot)$ is a semiring with 1 as identity which satisfies (C), since 
$1 + a = a$ holds for all positive $a \in \mathbb{N}$. 
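
Condition (C') asks for explicit witnesses $s_1, s_2$, and in both examples these can be written down directly. The Python sketch below uses witnesses chosen here for illustration (they are one possible choice, not taken from the text): in $\mathbb{Q}^+$ one may take $s_1 = 0$ and $s_2 = s/a$, and in $(\mathbb{N}, \max, \cdot)$ one may take $s_1 = s_2 = s$.

```python
from fractions import Fraction


def satisfies_c_prime(add, mul, s, a, s1, s2):
    """Check the defining identity of (C'): s + s1*a == s2*a in the given semiring."""
    return add(s, mul(s1, a)) == mul(s2, a)


def rational_witnesses(s, a):
    """Witnesses in (Q+, +, *): s + 0*a == (s/a)*a for a > 0."""
    return Fraction(0), Fraction(s) / Fraction(a)


def max_witnesses(s, a):
    """Witnesses in (N, max, *): max(s, s*a) == s*a for a >= 1."""
    return s, s
```

Passing the usual addition and multiplication of fractions, or max and the usual product of integers, as add and mul lets satisfies_c_prime confirm the identity for any sample of elements with a nonzero.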

Lemma A.2.1 If a semiring S with an absorbing zero 0 satisfies condition (C'), 
then $ab = 0$ for $a, b \in S$ implies $a = 0$ or $b = 0$. 

Proof By way of contradiction, assume $ab = 0$ and $a \neq 0 \neq b$. Then 
$s + s_1 a = s_2 a$ according to (C') yields $sb + s_1 ab = s_2 ab$, i.e., $sb = 0$ 
for all $s \in S$. Consequently $x + s_3 b = s_4 b$ implies $x = 0$, for all 
$s_3, s_4 \in S$, which contradicts (C') applied to the element $b \in S'$. □ 

Theorem A.2.2 Let S be a semiring. Then condition (C') implies that S contains 
only trivial k-ideals. The converse is true if $(S, \cdot)$ is commutative and 
provided that, if S has an element 0, then $Sa = \{sa : s \in S\} \neq \{0\}$ holds 
for all $a \in S'$. 

Proof Assume that S satisfies (C'). Let A be a k-ideal of S which contains at least 
one element $a \in S'$. Then $s + s_1 a = s_2 a$ according to (C') implies $s \in A$ 
for each $s \in S$, i.e., $A = S$. For the converse, our supplementary assumption on 
S shows that $Sa$ is an ideal of S and that $Sa \neq \{0\}$ holds for each 
$a \in S'$ if S has an element 0. We assume that S has only trivial k-ideals. Then 
the k-ideal $\overline{Sa}$ coincides with S for each $a \in S'$, regardless of 
whether S has an element 0 or not. Now 
$\overline{Sa} = \{s \in S : s + s_1 a = s_2 a \text{ for some } s_i \in S\} = S$ 
states that S satisfies condition (C'). □ 

Theorem A.2.3 Let S be a commutative semiring with identity 1 and A a proper 
k-ideal of S. Then A is maximal iff the semiring $S/A = S/\rho_A$ satisfies 
condition (C), where $\rho_A$ is the Bourne congruence on S. 

Proof Suppose A is a maximal k-ideal of S. Then A is the absorbing zero of $S/A$ 
and $1\rho_A$ its identity. Consider any $c\rho_A \in (S/A)'$. Then $c \notin A$ 
holds, and the smallest ideal B of S containing c and A consists of all elements 
$sc$, $a$ and $sc + a$ for $s \in S$ and $a \in A$. From 
$A \subset B \subseteq \bar{B}$ it follows that $\bar{B} = S$ and hence 
$1 + b_1 = b_2$ for suitable elements $b_1, b_2 \in B$. If $b_1 = s_1 c + a_1$ and 
$b_2 = s_2 c + a_2$ for suitable $s_i \in S$ and $a_i \in A$, then we obtain 
$1 + s_1 c + a_1 = s_2 c + a_2$. This implies 
$1\rho_A + (s_1\rho_A)(c\rho_A) = (s_2\rho_A)(c\rho_A)$. Discussions of other cases 
are similar. As a result $S/\rho_A$ satisfies condition (C). 

Conversely, assume (C) for $S/A$ and let B be a k-ideal of S satisfying 
$A \subset B$. Then there is an element $c \in B \setminus A$, and 
$c\rho_A \in (S/A)'$ yields $1\rho_A + (s_1\rho_A)(c\rho_A) = (s_2\rho_A)(c\rho_A)$ 
for suitable $s_i \in S$ by (C). Hence $(1 + s_1 c)\rho_A = (s_2 c)\rho_A$. 
Consequently, $1 + s_1 c + a_1 = s_2 c + a_2$ holds for some $a_i \in A$. Hence 
$1 + b_1 = b_2$ holds for some $b_1, b_2 \in B$. This implies $1 \in B$. 
Consequently, $B = S$. This shows that A is a maximal k-ideal of S. □ 

[For more results, see Weinert et al. 1996]. 

In the rest of this section we only consider semirings S in which addition is 
commutative. 

Definition A.2.2 An ideal A of a semiring S is said to be completely prime iff 
$ab \in A$ implies $a \in A$ or $b \in A$, for any $a, b \in S$. 

Proposition A.2.1 Let S be a commutative semiring with identity. Then each 
maximal k-ideal A of S is completely prime. 

Proof By Theorem A.2.3, the semiring $S/A$ satisfies the condition (C) and hence 
condition (C') (see Definition A.2.1). Since $S/A$ has A as its absorbing zero, we 
can apply Lemma A.2.1 and find that $S/A$ has no zero divisors. Hence 
$a\rho_A \neq A$ and $b\rho_A \neq A$ imply $(ab)\rho_A \neq A$, i.e., 
$a \notin A$ and $b \notin A$ imply $ab \notin A$, where $\rho_A$ is the Bourne 
congruence defined in Sect. A.1. □ 

Concerning the converse of Proposition A.2.1, we show that a completely prime 
ideal A of a commutative semiring S with identity need not be a k-ideal, and if it 
is one, A need not be a maximal k-ideal of S. 

Example A.2.3 (a) Let S be the set of all real numbers a satisfying 
$0 < a \le 1$ and define $a + b = a \cdot b = \min\{a, b\}$, $\forall a, b \in S$. 
Then $(S, +, \cdot)$ is a commutative semiring with 1 as identity. Each real 
number r such that $0 < r < 1$ defines an ideal $A = \{a \in S : a \le r\}$ of S 
which is completely prime. However, $r + 1 = r$ together with $r \in A$ and 
$1 \notin A$ shows that A is not a k-ideal of S. The same is true if one includes 0 
in these considerations (in this case 0 is an absorbing element but not a zero of 
$S \cup \{0\}$), but also if one adjoins 0 as an absorbing zero to S. 

(b) The polynomial ring $\mathbb{Z}[x]$ over the ring $\mathbb{Z}$ of integers 
contains the subsemiring 

$$S = \mathbb{N}[x] = \Big\{ f(x) = \sum_{i=0}^{n} a_i x^i : a_i \in \mathbb{N} \Big\},$$

which is commutative and has $1 \in \mathbb{N}$ as its identity. The ideal 
$A = (x)$ of S consists of all $f(x) \in S$ such that $a_0 = 0$ holds. Then A is 
completely prime and a k-ideal of S. Now consider the set 
$B = \{f(x) \in S : a_0 \text{ is divisible by } 2\}$. Then B is a k-ideal of S, 
and $A \subset B \subset S$ shows that A is not a maximal k-ideal. 

All maximal k-ideals of the semiring $\mathbb{N}$ are described below: 

Proposition A.2.2 Let $(\mathbb{N}, +, \cdot)$ be the semiring of non-negative 
integers under usual operations. Then $\mathbb{N}$ has exactly the k-ideals 
$(a) = \{na : n \in \mathbb{N}\}$ for each $a \in \mathbb{N}$. Consequently, the 
maximal k-ideals of $\mathbb{N}$ are given by $(p)$ for each positive prime p. 

Proof Clearly, each ideal $(a)$ of $\mathbb{N}$ is a k-ideal. Now assume that 
$A \neq \{0\}$ is a k-ideal of $\mathbb{N}$. Let a be the smallest positive integer 
contained in A, and b any element of A. Then $b = qa + r$ holds for some 
$q, r \in \mathbb{N}$ satisfying $0 \le r < a$. Since $qa \in A$ and 
$qa + r = b \in A$, the element r belongs to the k-ideal A; by the minimality of a 
it follows that $r = 0$, and hence $A = (a)$. The last statement follows, since 
$(a) \subseteq (b) \Leftrightarrow b \mid a$. □ 

Remark 1 None of the maximal k-ideals $(p)$ of $\mathbb{N}$ is a maximal ideal of 
$\mathbb{N}$. This follows since each ideal $A = (p)$ is properly contained in the 
proper ideal $B = \{0\} \cup \{b \in \mathbb{N} : b \ge p\}$ of $\mathbb{N}$. 
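
The difference between the ideals $(p)$ and B of Remark 1 can be seen by computing k-closures on an initial segment of $\mathbb{N}$. The Python sketch below is an illustration under stated assumptions: the function name and the cut-off limit are chosen here, an ideal is represented by a membership test, and B is taken to contain 0 together with all integers $b \ge p$. For these two ideals the truncated search gives the exact closure on the segment.

```python
def k_closure(in_ideal, limit):
    """Elements x <= limit with x + a1 in A for some a1 in A, a1 <= limit.

    The ideal A of N is given by its membership test in_ideal.
    """
    members = [a for a in range(limit + 1) if in_ideal(a)]
    return [x for x in range(limit + 1) if any(in_ideal(x + a1) for a1 in members)]


def principal(p):
    """Membership test for the ideal (p) of N."""
    return lambda a: a % p == 0


def tail(p):
    """Membership test for B = {0} together with all b >= p."""
    return lambda a: a == 0 or a >= p
```

For p = 5 and limit = 30, the closure of principal(5) consists of the multiples of 5 only, so $(5)$ is k-closed, whereas the closure of tail(5) is the whole segment, so B is an ideal but not a k-ideal.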

Remark 2 Let $\mathcal{A}$ be the set of the ideals $(p)$, p prime, of Remark 1, and 
let $\tau_{\mathcal{A}}$ be the hull-kernel topology defined on $\mathcal{A}$. Then 

(a) $(\mathcal{A}, \tau_{\mathcal{A}})$ is a connected space, 

(b) $(\mathcal{A}, \tau_{\mathcal{A}})$ is compact. 

(See Proposition A.6.1.) 

Recall that a semiring S is said to be additively regular iff for each $a \in S$ 
there exists an element $b \in S$ such that $a = a + b + a$. If, in addition, the 
element b is unique and satisfies $b = b + a + b$, then S is called an additively 
inverse semiring. In an additively inverse semiring, the unique inverse b of an 
element a is usually denoted by $a'$. 

Exercise (Karvellas 1974) Let S be an additively inverse semiring. Then 

(i) $x = (x')'$, $(x + y)' = y' + x'$, $(xy)' = x'y = xy'$ and $xy = x'y'$ 
$\forall x, y \in S$. 

(ii) $E^+ = \{x \in S : x + x = x\}$ is an additively commutative semilattice and 
an ideal of S. 

Definition A.2.3 Let S be an additively inverse semiring in which addition is 
commutative, and let $E^+$ denote the set of all additive idempotents of S. A left 
k-ideal A of S is said to be full iff $E^+ \subseteq A$. A right full k-ideal of S is 
defined dually. A non-empty subset I of S is called a full k-ideal iff I is both a 
left and a right full k-ideal. 

Example A.2.4 (a) In a ring every ring ideal is a full k-ideal. 

(b) In a distributive lattice with more than two elements, a proper ideal is a 
k-ideal but not a full k-ideal. 

(c) Let $\mathbb{Z} \times \mathbb{N}^+ = \{(a, b) : a, b \text{ are integers and } b > 0\}$. 
Define $(a, b) + (c, d) = (a + c, \operatorname{lcm}(b, d))$ and 
$(a, b)(c, d) = (ac, \gcd(b, d))$. Then $\mathbb{Z} \times \mathbb{N}^+$ becomes an 
additively inverse semiring in which addition is commutative. Let 
$A = \{(a, b) \in \mathbb{Z} \times \mathbb{N}^+ : a = 0\}$. Then A is a full 
k-ideal of $\mathbb{Z} \times \mathbb{N}^+$. 
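
Example A.2.4(c) lends itself to a finite sanity check. The Python sketch below (function names and the sample are choices made here for illustration) implements the two operations on pairs, takes $(-a, b)$ as the candidate inverse of $(a, b)$, and verifies on a sample that $x = x + x' + x$, that $x' = x' + x + x'$, that left distributivity holds, and that $x + x'$ is an additive idempotent lying in A. Uniqueness of the inverse is not tested.

```python
from itertools import product
from math import gcd


def lcm(b, d):
    return b * d // gcd(b, d)


def add(x, y):
    """(a, b) + (c, d) = (a + c, lcm(b, d))."""
    return (x[0] + y[0], lcm(x[1], y[1]))


def mul(x, y):
    """(a, b)(c, d) = (ac, gcd(b, d))."""
    return (x[0] * y[0], gcd(x[1], y[1]))


def inverse(x):
    """Candidate additive inverse x' of x = (a, b), namely (-a, b)."""
    return (-x[0], x[1])


def check(sample):
    """Check regularity, distributivity and E+ inside A on a finite sample."""
    regular = all(
        add(add(x, inverse(x)), x) == x
        and add(add(inverse(x), x), inverse(x)) == inverse(x)
        for x in sample
    )
    distributive = all(
        mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
        for x, y, z in product(sample, repeat=3)
    )
    # x + x' is idempotent and has first coordinate 0, so it lies in A
    idempotents_in_A = all(
        add(e, e) == e and e[0] == 0 for e in (add(x, inverse(x)) for x in sample)
    )
    return regular and distributive and idempotents_in_A
```

A sample such as all pairs $(a, b)$ with $-3 \le a \le 3$ and $1 \le b \le 6$ is small enough to run instantly and already exercises the lattice identity $\gcd(b, \operatorname{lcm}(d, f)) = \operatorname{lcm}(\gcd(b, d), \gcd(b, f))$ behind distributivity.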


A.3 Ring Congruences and Their Characterization 

We now define a ring congruence and characterize those ring congruences $\rho$ on 
additively inverse semirings S in which addition is commutative and which satisfy 
$-(a\rho) = a'\rho$, where $a'$ denotes the inverse of a in S and $-(a\rho)$ denotes 
the additive inverse of $a\rho$ in the ring $S/\rho$. 

Definition A.3.1 A congruence $\rho$ on a semiring S is called a ring congruence iff 
the quotient semiring $S/\rho$ is a ring. 

Theorem A.3.1 Let A be a full k-ideal of S. Then the relation 
$\rho_A = \{(a, b) \in S \times S : a + b' \in A\}$ is a ring congruence of S such 
that $-(a\rho_A) = a'\rho_A$. 

Proof Since $a + a' \in E^+ \subseteq A$ for all $a \in S$, it follows that 
$\rho_A$ is reflexive. Let $a + b' \in A$. Now from W.O. Ex. 1(a), we find that 
$(a + b')' \in A$. Then $b + a' = (b')' + a' = (a + b')' \in A$. Hence $\rho_A$ is 
symmetric. Let $a + b' \in A$ and $b + c' \in A$. Then $a + b + b' + c' \in A$. 
Also $b + b' \in E^+ \subseteq A$. Since A is a k-ideal, we find that 
$a + c' \in A$. Hence $\rho_A$ is an equivalence relation. Let $(a, b) \in \rho_A$ 
and $c \in S$. Then $a + b' \in A$. Since 

$$(c + a) + (c + b)' = c + a + b' + c' = (a + b') + (c + c') \in A,$$
$$ca + (cb)' = ca + cb' = c(a + b') \in A \quad \text{and}$$
$$ac + (bc)' = ac + b'c = (a + b')c \in A,$$

it follows that $\rho_A$ is a congruence on S. So we obtain the quotient semiring 
$S/\rho_A$, where addition and multiplication are defined by 

$$a\rho_A + b\rho_A = (a + b)\rho_A \quad \text{and} \quad (a\rho_A)(b\rho_A) = (ab)\rho_A.$$

Now 

$$a\rho_A + b\rho_A = (a + b)\rho_A = (b + a)\rho_A = b\rho_A + a\rho_A.$$

Let $e \in E^+$ and $a \in S$. Now $(e + a) + a' = e + (a + a') \in E^+$. 
We find that $(e + a)\rho_A = a\rho_A$. Then $e\rho_A + a\rho_A = a\rho_A$. 
Also 

$$a\rho_A + a'\rho_A = (a + a')\rho_A = e\rho_A.$$

Hence $e\rho_A$ is the zero element and $a'\rho_A$ is the negative element of 
$a\rho_A$ in the ring $S/\rho_A$. □ 

Theorem A.3.2 Let $\rho$ be a congruence of S such that $S/\rho$ is a ring and 
$-(a\rho) = a'\rho$. Then there exists a full k-ideal A of S such that 
$\rho_A = \rho$. 

Proof Let $A = \{a \in S : (a, e) \in \rho \text{ for some } e \in E^+\}$. Since 
$\rho$ is reflexive, it follows that $E^+ \subseteq A$. Then $A \neq \emptyset$, 
since $E^+ \neq \emptyset$. Let $a, b \in A$. Then there exist $e, f \in E^+$ such 
that $(a, e) \in \rho$ and $(b, f) \in \rho$. Then $(a + b, e + f) \in \rho$. But 
$e + f \in E^+$. Hence $a + b \in A$. Again for any $r \in S$, $(ra, re) \in \rho$ 
and $(ar, er) \in \rho$. But $re$ and $er \in E^+$. Hence A is an ideal of S. Let 
$a + b \in A$ and $b \in A$. Then there exist $e, f \in E^+$ such that 
$(a + b, f) \in \rho$ and $(b, e) \in \rho$. Hence 
$f\rho = (a + b)\rho = a\rho + b\rho = a\rho + e\rho$. But $f\rho$ and $e\rho$ are 
additive idempotents in the ring $S/\rho$. Hence $e\rho = f\rho$ is the zero 
element of $S/\rho$. As a result $a\rho$ is the zero element of $S/\rho$. Then 
$a\rho = e\rho$. This implies $a \in A$. So we find that A is a full k-ideal of S. 
Consider now the congruences $\rho_A$ and $\rho$. Let $(a, b) \in \rho$. Then 
$(a + b', b + b') \in \rho$. But $b + b' \in E^+$. Hence $a + b' \in A$ and 
$(a, b) \in \rho_A$. Conversely, suppose that $(a, b) \in \rho_A$. Then 
$a + b' \in A$. Hence $(a + b', e) \in \rho$ for some $e \in E^+$. As a result, 
$e\rho = a\rho + b'\rho = a\rho - b\rho$ holds in the ring $S/\rho$. But $e\rho$ is 
the zero element of $S/\rho$. Consequently $a\rho = b\rho$. This shows that 
$(a, b) \in \rho$ and hence $\rho_A = \rho$. □ 

Remark (a) Certain types of ring congruences on an additively inverse semiring are 
characterized with the help of full k-ideals (see Theorems A.3.1 and A.3.2). 

(b) It is shown that the set of all full k-ideals of an additively inverse semiring in 
which addition is commutative forms a complete lattice which is also modular (see 
W.O. Ex. 1(d)). 


A.4 k-Regular Semirings 

Definition A.4.1 Let S be an additively commutative semiring and $a \in S$. Then a 
is called regular (in the sense of von Neumann) if $a = axa$ holds for some 
$x \in S$, and k-regular if there are $x, y \in S$ satisfying $a + aya = axa$. If all 
the elements of S have the corresponding property, S is called a regular or a 
k-regular semiring. Since $aya$ can be added to both sides of $a = axa$, each 
regular semiring is also k-regular. If S happens to be a ring, both concepts 
coincide, and each (additively commutative) semifield S is regular. 

Example A.4.1 For an additively commutative semiring S with an identity 1, a 
condition (C') is introduced in Definition A.2.1 by the property that each element 
$a \in S$ which is not a multiplicatively absorbing zero satisfies 
$1 + s_1 a = s_2 a$ for suitable elements $s_i \in S$. This condition implies that S 
has only trivial k-ideals, and the converse holds if $(S, \cdot)$ is commutative 
(see Theorem A.2.2). Clearly, each semiring of this kind is k-regular, but not 
necessarily regular. 

Lemma A.4.1 Let S be an additively commutative and cancellative semiring and 
$R = D(S)$ its difference ring. If R is regular, then S is k-regular. The converse 
holds if S is semisubtractive, but not in general. 

Proof Let R be regular and $a \in S$. Then $a = ara$ holds for a suitable element 
$r \in R$, say $r = x - y$ for $x, y \in S$, which yields $a + aya = axa$. Now 
assume that S is k-regular and semisubtractive. Then each $r \neq 0$ of R satisfies 
$r = a$ or $r = -a$ for some $a \in S$, and $a + aya = axa$ for some $x, y \in S$ 
implies $a = a(x - y)a$ or $-a = a(y - x)a$ in R. For the last statement, let 
$\mathbb{Q}[z]$ be the polynomial ring over the field of rational numbers and 
$R = \mathbb{Q}[z]/(z^2)$. For brevity, we may write $a + bz$ for all elements of R. 
It was shown in Weinert (1963) that $S = \{a + bz \in R : a > 0\}$ 
is a subsemifield of R such that $R = D(S)$ holds. Hence S is even regular, whereas 
$zrz = 0$ for all $r \in R$ shows that R is not regular. □ 

Note that S can be replaced by $S \cup \{0\}$ in this example. 

Example A.4.2 Let $M_{n,n}(H_0)$ be the semiring of all $n \times n$ matrices over 
the semifield $H_0$ of all non-negative rational numbers. Then $M_{n,n}(H_0)$ is 
k-regular, since the matrix ring $M_{n,n}(\mathbb{Q}) = D(M_{n,n}(H_0))$ is well 
known to be regular. Clearly, $H_0$ may be replaced in this example by any 
semifield S such that $D(S)$ is a field [see Weinert (1963)]. 

Lemma A.4.2 Let S be a k-regular semiring. Then $A \cap B = \overline{BA}$ holds 
for each left k-ideal A and each right k-ideal B of S. 

Proof From $BA \subseteq A \cap B$, we get $\overline{BA} \subseteq \bar{A} = A$ and 
$\overline{BA} \subseteq \bar{B} = B$. Hence $\overline{BA} \subseteq A \cap B$. 
To show the reverse inclusion, assume $a \in A \cap B$. Then 
$aza = a(za) \in BA$ for each $z \in S$, and $a + aya = axa$ imply 
$a \in \overline{BA}$. □ 

Theorem A.4.1 For such a hemiring S, the set $K(S)$ of all k-ideals of S forms a
complete lattice $(K(S), \subseteq)$. This lattice is distributive if

(2.1) $A \cap B = \overline{AB}$ holds for all k-ideals $A = \overline{A}$ and $B = \overline{B}$ of S,

hence in particular if the hemiring is k-regular.

Proof The intersection of any set of k-ideals of S is known to be a k-ideal of S,
and S is a k-ideal of S. This yields the result that $(K(S), \subseteq)$ is a complete lattice,
for which $A \wedge B = A \cap B$ is the infimum and $A \vee B = \overline{A + B}$ is the supremum
of $A, B \in K(S)$. Moreover, as in each lattice, $A \wedge (B \vee C) \supseteq (A \wedge B) \vee (A \wedge C)$
holds for all $A, B, C \in K(S)$. So it remains to show the reverse inclusion. Now
$A \wedge (B \vee C) = A \cap \overline{(B + C)} = \overline{A\,\overline{(B + C)}}$ holds by (2.1). Suppose $x \in \overline{A\,\overline{(B + C)}}$. Then
$x + u = v$ for some $u, v \in A\,\overline{(B + C)}$. Hence $u = a_1 y_1$ and $v = a_2 y_2$ for suitable
$a_i \in A$ and $y_i \in \overline{B + C}$. Then

$y_1 + b_1 + c_1 = b_2 + c_2$ and $y_2 + b_3 + c_3 = b_4 + c_4$

for suitable $b_i \in B$ and $c_i \in C$. Thus

$a_1 y_1 + a_1 b_1 + a_1 c_1 = a_1 b_2 + a_1 c_2$ yields $u + a_1 b_1 + a_1 c_1 = a_1 b_2 + a_1 c_2$,

which states $u \in \overline{AB + AC}$. Likewise we obtain $v \in \overline{AB + AC}$ and thus
$x \in \overline{AB + AC}$ for $x + u = v$, since $\overline{AB + AC}$ is k-closed. Now $A \wedge (B \vee C) \subseteq \overline{AB + AC}$
and $\overline{AB + AC} \subseteq \overline{\overline{AB} + \overline{AC}} = \overline{(A \cap B) + (A \cap C)} = (A \wedge B) \vee (A \wedge C)$
complete the proof that the lattice $(K(S), \subseteq)$ is distributive.

We show in this context that, for each hemiring S, the condition (2.1) is equivalent to

(2.2) $A = \overline{AA}$ holds for each k-ideal $A = \overline{A}$ of S.

Clearly, (2.1) for $A = B$ yields (2.2), and $AB \subseteq A \cap B$ implies for k-ideals always
$\overline{AB} \subseteq \overline{A \cap B} = A \cap B$. Applying (2.2) to the k-ideal $A \cap B$, we get the other
inclusion of (2.1) by $A \cap B = \overline{(A \cap B)(A \cap B)} \subseteq \overline{AB}$. □

Remark (2.2) follows from the (obviously stronger) condition that $A = A^2$ holds for
all ideals of a hemiring S. The latter, for instance, is satisfied if S is regular, and it implies,
by a result of Ahsan (1993), that the complete lattice of all ideals of S is Brouwerian
and thus distributive. However, none of the sufficient conditions for distributivity
mentioned so far is necessary.

Example A.4.3 One easily checks that the set $S = \{0, a, b\}$ becomes a hemiring
according to the following tables.

    + | 0  a  b        · | 0  a  b
   ---+---------      ---+---------
    0 | 0  a  b        0 | 0  0  0
    a | a  a  b        a | 0  0  0
    b | b  b  b        b | 0  0  b

This hemiring has the k-ideals $\{0\} \subset \{0, a\} \subset S$ and one more ideal $\{0, b\}$ whose
k-closure is S. Hence $(K(S), \subseteq)$ and obviously also the lattice of all ideals of S are
distributive. However, $\{0, a\}^2 = \{0\}$ disproves (2.2) and thus k-regularity and clearly
$A^2 = A$ for all ideals A of S.
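The claims of Example A.4.3 can be checked mechanically. The following is a minimal sketch in Python, assuming the two tables are read as "addition is the maximum for the order 0 < a < b" and "the only non-zero product is b·b = b"; the helper names (`is_hemiring`, `is_ideal`, `k_closure`, `ideal_product`) are illustrative and do not come from the text.

```python
from itertools import combinations, product

S = ["0", "a", "b"]

# Addition table: the maximum for the order 0 < a < b.
ADD = {(x, y): S[max(S.index(x), S.index(y))] for x, y in product(S, S)}
# Multiplication table: b*b = b, every other product is 0.
MUL = {(x, y): "b" if (x, y) == ("b", "b") else "0" for x, y in product(S, S)}


def is_hemiring():
    """Brute-force check of associativity, commutativity of +, distributivity, zero."""
    for x, y, z in product(S, repeat=3):
        if ADD[ADD[x, y], z] != ADD[x, ADD[y, z]]:
            return False
        if MUL[MUL[x, y], z] != MUL[x, MUL[y, z]]:
            return False
        if MUL[x, ADD[y, z]] != ADD[MUL[x, y], MUL[x, z]]:
            return False
        if MUL[ADD[x, y], z] != ADD[MUL[x, z], MUL[y, z]]:
            return False
    if any(ADD[x, y] != ADD[y, x] for x, y in product(S, S)):
        return False
    return all(ADD["0", x] == x and MUL["0", x] == "0" == MUL[x, "0"] for x in S)


def is_ideal(I):
    """I contains 0, is closed under +, and absorbs products from both sides."""
    closed = all(ADD[x, y] in I for x in I for y in I)
    absorbs = all(MUL[s, x] in I and MUL[x, s] in I for x in I for s in S)
    return "0" in I and closed and absorbs


def k_closure(I):
    """All x such that x + u = v for some u, v in I."""
    return {x for x in S if any(ADD[x, u] in I for u in I)}


def ideal_product(A, B):
    """Additive closure of the set of products ab with a in A, b in B."""
    sums = {MUL[a, b] for a in A for b in B}
    while True:
        bigger = sums | {ADD[x, y] for x in sums for y in sums}
        if bigger == sums:
            return sums
        sums = bigger


def subsets_with_zero():
    rest = [x for x in S if x != "0"]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            yield {"0", *combo}


ideals = [I for I in subsets_with_zero() if is_ideal(I)]
k_ideals = [I for I in ideals if k_closure(I) == I]
```

Under these assumptions, `k_ideals` is the chain {0}, {0, a}, S, the ideal {0, b} has k-closure S, and `ideal_product({"0", "a"}, {"0", "a"})` is {0}, which is the failure of (2.2) used in the example.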

Finally, considering any ring R as a hemiring, the k-ideals of R are just the
ring-theoretical ideals of R, well known to form a modular lattice. So one could suspect
that the latter holds also for the lattice $K(S)$ for each hemiring S. But the following
example shows that the complete lattice $(K(S), \subseteq)$ of all k-ideals of a hemiring S
does not need to be modular.

Remark For each k-regular hemiring S, the set $K(S)$ of all k-ideals of S forms
a complete distributive lattice. In general, however, the lattice of all k-ideals of a
hemiring does not need to be even modular.

Example A.4.4 Consider the set $S = \{0, a, b, s, c, d\}$. We define $(S, +)$ by the table

    + | 0  a  b  s  c  d
   ---+------------------
    0 | 0  a  b  s  c  d
    a | a  a  s  s  d  d
    b | b  s  b  s  d  d
    s | s  s  s  s  d  d
    c | c  d  d  d  c  d
    d | d  d  d  d  d  d

and $xy = 0$ for all $x, y \in S$. Instead of dealing with associativity of $(S, +)$ directly,
note that $(S \setminus \{0\}, +)$ is the commutative and idempotent semigroup generated by
$\{a, b, c, d\}$ subject to the relations $a + c = d$ and $b + c = d$. Clearly, s is shorthand for
$a + b$. Now one checks that the subsets of S as shown in Fig. A.1 are k-ideals of this
semiring. By a well known criterion, $\{0, a, b, s\} \cap \{0, c\} = \{0\}$ and $\{0, a\} \vee \{0, c\} =
\overline{\{0, a\} + \{0, c\}} = \overline{\{0, a, c, d\}} = S$ show that $(K(S), \subseteq)$ is not modular.

Fig. A.1 k-ideals of S

Fig. A.2 Commutativity of the ring completion diagram
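The failure of modularity in Example A.4.4 can be confirmed by a short computation. The sketch below assumes the addition table as reconstructed above and the zero multiplication $xy = 0$, so that every additively closed subset containing 0 is an ideal; the function names are illustrative only.

```python
from itertools import product

ELEMS = ["0", "a", "b", "s", "c", "d"]
ROWS = {
    "0": "0 a b s c d",
    "a": "a a s s d d",
    "b": "b s b s d d",
    "s": "s s s s d d",
    "c": "c d d d c d",
    "d": "d d d d d d",
}
# ADD[x, y] is the entry in row x, column y of the addition table.
ADD = {(x, y): v for x, row in ROWS.items() for y, v in zip(ELEMS, row.split())}


def addition_is_commutative_and_associative():
    comm = all(ADD[x, y] == ADD[y, x] for x, y in product(ELEMS, ELEMS))
    assoc = all(ADD[ADD[x, y], z] == ADD[x, ADD[y, z]]
                for x, y, z in product(ELEMS, repeat=3))
    return comm and assoc


def k_closure(I):
    """All x such that x + u = v for some u, v in I."""
    return {x for x in ELEMS if any(ADD[x, u] in I for u in I)}


def is_k_ideal(I):
    """Since xy = 0, an ideal is any additively closed subset containing 0."""
    closed = all(ADD[x, y] in I for x in I for y in I)
    return "0" in I and closed and k_closure(I) == I


def join(A, B):
    """Supremum in K(S): the k-closure of A + B."""
    return k_closure({ADD[x, y] for x in A for y in B})


def modular_law_holds(X, Y, Z):
    """For X contained in Z, test X v (Y ^ Z) == (X v Y) ^ Z."""
    return join(X, Y & Z) == join(X, Y) & Z


X = {"0", "a"}
Y = {"0", "c"}
Z = {"0", "a", "b", "s"}
```

Here X ⊆ Z, the left-hand side of the modular law evaluates to X and the right-hand side to Z, so `modular_law_holds(X, Y, Z)` is false for this triple.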

Remark For one-sided h-ideals and k-ideals in semirings, see Weinert et al. (1996).


A.5 The Ring Completion of a Semiring 

The concept of a ring completion of a semiring arises through the process of passing
from the semiring of non-negative integers to the ring of integers. This concept
is used to study the stability properties of vector bundles (see Husemoller 1966,
Chap. 8).

Definition A.5.1 Let S be a semiring with 0 (absorbing). The ring completion of S
is a pair $(S^0, f)$, where $S^0$ is a ring and $f : S \to S^0$ is a semiring homomorphism
such that for any semiring homomorphism $h : S \to R$, where R is an arbitrary ring,
there exists a unique ring homomorphism $g : S^0 \to R$ such that $g \circ f = h$, i.e., the
diagram in Fig. A.2 is commutative.

We now prescribe a construction of $S^0$.

Construction of $S^0$: Consider the pairs $(a, b) \in S \times S$ and define an equivalence
relation $\rho$ on these pairs: $(a, b)$ and $(c, d)$ are said to be $\rho$-equivalent iff there exists
some $l \in S$ such that $a + d + l = c + b + l$. Denote the $\rho$-equivalence class of $(a, b)$ by
$\langle a, b \rangle$. We may consider $\langle a, b \rangle = a - b$. Let $S^0$ be the set of all $\rho$-equivalence classes
$\langle a, b \rangle$. Define $\langle a, b \rangle + \langle c, d \rangle = \langle a + c, b + d \rangle$ and $\langle a, b \rangle \cdot \langle c, d \rangle = \langle ac + bd, bc + ad \rangle$.
These compositions are independent of the choice of the representatives of classes
and hence are well defined. Clearly, $(S^0, +, \cdot)$ is a ring with its zero element $0 =
\langle 0, 0 \rangle$ and the negative of $\langle a, b \rangle$ being $\langle b, a \rangle$. The homomorphism $f : S \to S^0$ is defined
by $f(x) = \langle x, 0 \rangle$.

Fig. A.3 Commutative diagram for ring completion

Uniqueness of $(S^0, f)$: Let $(S_1^0, f_1)$ be another ring completion of S. Then
there exist ring homomorphisms $g : S^0 \to S_1^0$ and $h : S_1^0 \to S^0$ making the diagram
in Fig. A.3 commutative.

It follows from the commutativity of the diagram, together with the uniqueness
requirement in Definition A.5.1, that both $g \circ h$ and $h \circ g$ are
identity homomorphisms of rings. Hence the rings $S^0$ and $S_1^0$ are isomorphic.

Remark 1 $S^0$ may be viewed as the free abelian group generated by the set S modulo
the subgroup generated by $(a + b) + (-1)a + (-1)b$, $a, b \in S$.

Remark 2 The process of passing from a semigroup to a group is analogous and 
yields its group completion. 
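The construction of the ring completion can be illustrated for the semiring of non-negative integers, whose ring completion is the ring of integers. The sketch below is only an illustration under that assumption: pairs of Python integers stand for representatives $(a, b)$, and because addition of non-negative integers is cancellative the witness $l$ in the equivalence relation may be taken to be 0. All function names are hypothetical.

```python
def equivalent(p, q):
    """(a, b) ~ (c, d) iff a + d + l = c + b + l for some l; l = 0 suffices here."""
    (a, b), (c, d) = p, q
    return a + d == c + b


def add(p, q):
    """<a, b> + <c, d> = <a + c, b + d>."""
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    """<a, b> . <c, d> = <ac + bd, bc + ad>."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, b * c + a * d)


def neg(p):
    """The negative of <a, b> is <b, a>."""
    return (p[1], p[0])


def embed(x):
    """The homomorphism f(x) = <x, 0>."""
    return (x, 0)


def to_int(p):
    """Identify the class <a, b> with the integer a - b."""
    return p[0] - p[1]
```

For instance, `(2, 5)` and `(3, 6)` represent the same class, and multiplying either of them by `(1, 3)` gives representatives of one and the same class, which is the independence of representatives asserted in the construction.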


A.6 Structure Spaces of Semirings 

The structure spaces of semirings, formed by the class of prime k-ideals and prime 
full k-ideals are considered in this section. More precisely, the properties such as 
separation axioms, compactness and connectedness in these structure spaces are 
studied. Finally, these properties for the semiring of non-negative integers are ex- 
amined. 

In this section we only consider semirings S for which ( S , +) is commutative. 

Definition A.6.1 Let $A \neq \emptyset$ be any subset of a semiring S. Then the ideal generated
by A is the intersection of all ideals I of S containing A.

We recall the definition of hull-kernel topology and some of its properties
(Gillman 1957; Kohls 1957; Slowikowski and Zawadowski 1955).

Definition A.6.2 Let $\mathcal{A}$ be the set of all proper prime ideals of a commutative ring
with identity. For any subset A of $\mathcal{A}$, the set $\overline{A} = \{I \in \mathcal{A} : \bigcap_{I_\alpha \in A} I_\alpha \subseteq I\}$ gives
a closure operator $A \mapsto \overline{A}$ defining a topology $\tau_{\mathcal{A}}$, called the hull-kernel topology
on $\mathcal{A}$. The hull-kernel topology on the set $\mathcal{A}$ of all proper prime k-ideals (full
k-ideals) of a commutative semiring with identity is defined in a similar way.

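Proposition A.6.1

(a) (i) $A \subseteq \overline{A}$,

(ii) $\overline{\overline{A}} = \overline{A}$,

(iii) $A \subseteq B \Rightarrow \overline{A} \subseteq \overline{B}$, and

(iv) $\overline{A \cup B} = \overline{A} \cup \overline{B}$ for all subsets A, B of $\mathcal{A}$.

(b) (i) For any commutative semiring S with identity 1, let $\mathcal{A}$ be the set of all
proper prime k-ideals of S and for each $a \in S$, let $\Delta(a) = \{I \in \mathcal{A} : a \in I\}$. If
$C\Delta(a) = \mathcal{A} \setminus \Delta(a)$, then $\{C\Delta(a) : a \in S\}$ forms an open base for the
hull-kernel topology on $\mathcal{A}$.

(ii) $(\mathcal{A}, \tau_{\mathcal{A}})$ is a $T_1$-space iff no element of $\mathcal{A}$ is contained in any other element
of $\mathcal{A}$.

(c) Let $\mathcal{M}$ be the set of all maximal k-ideals of S. Then $(\mathcal{M}, \tau_{\mathcal{M}})$ is a $T_1$-space,
where $\tau_{\mathcal{M}}$ is the induced topology on $\mathcal{M}$ from $(\mathcal{A}, \tau_{\mathcal{A}})$.

(d) For the semiring N of non-negative integers, let $A = (p)$, p is prime, and let $\mathcal{A}$ be
the set of all prime ideals A of N. Then

(i) $(\mathcal{A}, \tau_{\mathcal{A}})$ is a connected space;

(ii) $(\mathcal{A}, \tau_{\mathcal{A}})$ is compact;

where $\tau_{\mathcal{A}}$ is the hull-kernel topology on $\mathcal{A}$.

Proof Left as an exercise. □

We now study further properties of the structure spaces. Let I be any k-ideal
of S. Define $\Delta(I) = \{I' \in \mathcal{A} : I \subseteq I'\}$.

Proposition A.6.2 Any closed set in $\mathcal{A}$ is of the form $\Delta(I)$, where I is a k-ideal
of S.

Proof Let $\overline{A}$ be any closed set in $\mathcal{A}$, where $A \subseteq \mathcal{A}$. Let $A = \{I_\alpha : \alpha \in \Lambda\}$ and
$I = \bigcap_{\alpha \in \Lambda} I_\alpha$. Then $\overline{A} = \Delta(I)$. □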

Theorem A.6.1 $(\mathcal{A}, \tau_{\mathcal{A}})$ is a Hausdorff space if and only if for any distinct pair of
elements I, J of $\mathcal{A}$, there exist $a, b \in S$ such that $a \notin I$, $b \notin J$ and there does not
exist any element K of $\mathcal{A}$ such that $a \notin K$ and $b \notin K$.

Proof Let $(\mathcal{A}, \tau_{\mathcal{A}})$ be Hausdorff. Then for any pair of distinct elements I, J of $\mathcal{A}$
there exist basic open sets $C\Delta(a)$ and $C\Delta(b)$ such that $I \in C\Delta(a)$, $J \in C\Delta(b)$ and
$C\Delta(a) \cap C\Delta(b) = \emptyset$. It follows that $a \notin I$, $b \notin J$. Moreover, if there were a prime
k-ideal K of S with $a, b \notin K$, then $K \in C\Delta(a) \cap C\Delta(b)$, a contradiction, since
$C\Delta(a) \cap C\Delta(b) = \emptyset$.

Conversely, let the given condition hold and $I, J \in \mathcal{A}$, $I \neq J$. Let $a, b \in S$ be
such that $a \notin I$, $b \notin J$ and there does not exist any K of $\mathcal{A}$ such that $a \notin K$, $b \notin K$.
Then $I \in C\Delta(a)$, $J \in C\Delta(b)$ and $C\Delta(a) \cap C\Delta(b) = \emptyset$, which proves that $(\mathcal{A}, \tau_{\mathcal{A}})$ is
Hausdorff. □

Corollary If $(\mathcal{A}, \tau_{\mathcal{A}})$ is a $T_2$-space, then no proper prime k-ideal contains any other
proper prime k-ideal. If $(\mathcal{A}, \tau_{\mathcal{A}})$ contains more than one element, then there exist
$a, b \in S$ such that $\mathcal{A} = C\Delta(a) \cup C\Delta(b) \cup \Delta(I)$, where I is the k-ideal generated by
a, b.

Proof Suppose $(\mathcal{A}, \tau_{\mathcal{A}})$ is a $T_2$-space. Since every $T_2$-space is a $T_1$-space, $(\mathcal{A}, \tau_{\mathcal{A}})$ is
a $T_1$-space. Hence by Proposition A.6.1 no proper prime k-ideal contains any other
proper prime k-ideal. Now let $I_1, I_2 \in \mathcal{A}$, where $I_1 \neq I_2$. Then by Theorem A.6.1,
there exist a, b in S such that $a \neq b$, $a \notin I_1$, $b \notin I_2$ and $C\Delta(a) \cap C\Delta(b) = \emptyset$. Let I
be the k-ideal generated by a, b. An element K of $\mathcal{A}$ either omits a, or omits b, or
contains both a and b and hence I. Thus any element of $\mathcal{A}$
belongs to $C\Delta(a)$ or $C\Delta(b)$ or $\Delta(I)$, and therefore $C\Delta(a) \cup C\Delta(b) \cup \Delta(I) = \mathcal{A}$. □

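Theorem A.6.2 $(\mathcal{A}, \tau_{\mathcal{A}})$ is a regular space if and only if for any $I \in \mathcal{A}$ and $a \notin I$,
$a \in S$, there exist a k-ideal J of S and $b \in S$ such that $I \in C\Delta(b) \subseteq \Delta(J) \subseteq C\Delta(a)$.

Proof Let $(\mathcal{A}, \tau_{\mathcal{A}})$ be a regular space. Then for any $I \in \mathcal{A}$ and any closed set
not containing I, there exist disjoint open sets U, V such that $I \in U$ and V contains the closed set.
If $a \notin I$, then $I \in C\Delta(a)$ and $\mathcal{A} \setminus C\Delta(a)$ is a closed set not containing I. Hence there exist
disjoint open sets U, V containing I and $\mathcal{A} \setminus C\Delta(a)$, respectively. By Proposition A.6.2,
the closed set $\mathcal{A} \setminus V$ is of the form $\Delta(J)$ for some k-ideal J of S, and since the sets
$C\Delta(b)$ form an open base, there exists $b \in S$ such that $I \in C\Delta(b) \subseteq U$. Then
$I \in C\Delta(b) \subseteq \Delta(J) \subseteq C\Delta(a)$.

Conversely, let the given condition hold. Let $I \in \mathcal{A}$ and $\Delta(K)$ be any closed
set not containing I. Let $a \notin I$, $a \in K$. Then by the given condition, there exist a
k-ideal J and an element $b \in S$ such that $I \in C\Delta(b) \subseteq \Delta(J) \subseteq C\Delta(a)$. Obviously,
$C\Delta(a) \cap \Delta(K) = \emptyset$. So $C\Delta(b)$ and $\mathcal{A} \setminus \Delta(J)$ are two disjoint open sets containing
I and $\Delta(K)$, respectively. Consequently, $(\mathcal{A}, \tau_{\mathcal{A}})$ is a regular space. □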

Theorem A.6.3 $(\mathcal{A}, \tau_{\mathcal{A}})$ is a compact space if and only if for any collection $\{a_i\}$ of
elements of S there exists a finite number of elements $a_1, a_2, \ldots, a_r$ in S such that
for any $I \in \mathcal{A}$, there exists $a_i$ such that $a_i \notin I$.

Proof Let $(\mathcal{A}, \tau_{\mathcal{A}})$ be a compact space. Then the open cover $\{C\Delta(a) : a \in S\}$ has a
finite subcover $\{C\Delta(a_i) : i = 1, \ldots, r\}$. Thus if $I \in \mathcal{A}$, then $I \in C\Delta(a_i)$ for some i,
which implies that $a_i \notin I$. Hence $a_1, \ldots, a_r$ are the required finite number of elements.

The converse follows from Kohls (1957, Lemma 3.1). □

Corollary If S is finitely generated, then $(\mathcal{A}, \tau_{\mathcal{A}})$ is compact.

Proof Let $\{a_i : i = 1, \ldots, r\}$ be a set of generators of S. Then for any $I \in \mathcal{A}$, there
exists $a_i$ such that $a_i \notin I$, since I is a proper k-ideal. Hence by Theorem A.6.3,
$(\mathcal{A}, \tau_{\mathcal{A}})$ is compact. □

Proposition A.6.3 Let $(\mathcal{S}, \tau_{\mathcal{S}})$ be the space of all proper prime full k-ideals of S.
Then $(\mathcal{S}, \tau_{\mathcal{S}})$ is compact if and only if $E^+ = \{x \in S : x + x = x\} \neq \{0\}$.

Proof Let $\{\Delta(I_\alpha) : \alpha \in \Lambda\}$ be any collection of closed sets in $\mathcal{S}$ with finite
intersection property. Let I be the proper prime k-ideal which is also the full k-ideal
generated by $E^+$. Since any prime full k-ideal J contains $E^+$, J contains I. Hence
$I \in \bigcap_{\alpha \in \Lambda} \Delta(I_\alpha) \neq \emptyset$. Consequently, $(\mathcal{S}, \tau_{\mathcal{S}})$ is compact. □

Definition A.6.3 A semiring S is said to be k-Noetherian if and only if it satisfies
the ascending chain condition on k-ideals of S, i.e., if and only if for each ascending
chain $I_1 \subseteq I_2 \subseteq \cdots \subseteq I_n \subseteq \cdots$ of k-ideals of S, there exists a positive integer m such
that $I_n = I_m$ for all $n \geq m$.

Theorem A.6.4 If S is a k-Noetherian semiring, then $(\mathcal{A}, \tau_{\mathcal{A}})$ is countably compact.

Proof Let $\{\Delta(I_n)\}_{n=1}^{\infty}$ be a countable collection of closed sets in $\mathcal{A}$ with finite
intersection property. Let $\langle A \rangle$ denote the prime k-ideal generated by A, where $A\ (\neq \emptyset)$
is any subset of S. Let us consider the following ascending chain of prime k-ideals:
$\langle I_1 \rangle \subseteq \langle I_1 \cup I_2 \rangle \subseteq \langle I_1 \cup I_2 \cup I_3 \rangle \subseteq \cdots$. Since S is k-Noetherian, there exists a positive
integer m such that $\langle I_1 \cup I_2 \cup \cdots \cup I_m \rangle = \langle I_1 \cup I_2 \cup \cdots \cup I_{m+1} \rangle = \cdots$, which shows
that $\langle I_1 \cup \cdots \cup I_m \rangle \in \bigcap_{n=1}^{\infty} \Delta(I_n) \neq \emptyset$. Hence $(\mathcal{A}, \tau_{\mathcal{A}})$ is countably compact. □

Corollary If S is a k-Noetherian semiring and $(\mathcal{A}, \tau_{\mathcal{A}})$ is second countable, then
$(\mathcal{A}, \tau_{\mathcal{A}})$ is compact.

Proof It follows from Theorem A.6.4 and the fact that any open cover of a second
countable space has a countable subcover. □

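Theorem A.6.5 $(\mathcal{A}, \tau_{\mathcal{A}})$ is disconnected if and only if there exist a k-ideal I of S
and a collection of points $\{a_\alpha\}_{\alpha \in \Lambda}$ of S not belonging to I such that if $I' \in \mathcal{A}$ and
$a_\alpha \in I'$, $\forall \alpha \in \Lambda$, then $I \setminus I' \neq \emptyset$.

Proof Let $(\mathcal{A}, \tau_{\mathcal{A}})$ be not connected. Then there exists a non-trivial open and closed
subset of $\mathcal{A}$. Let I be the k-ideal of S for which $\Delta(I)$ is closed as well as open. Then
$\Delta(I) = \bigcup_{\alpha \in \Lambda} C\Delta(a_\alpha)$, where $\{a_\alpha\}_{\alpha \in \Lambda}$ is a collection of points of S. Now since
$C\Delta(a_\alpha) \subseteq \Delta(I)$, $\forall \alpha \in \Lambda$, for any $I' \in C\Delta(a_\alpha)$ we have $I \subseteq I'$, and therefore $a_\alpha \notin I$
as $a_\alpha \notin I'$, $\forall \alpha \in \Lambda$. Now for any $I' \in \mathcal{A}$ with $a_\alpha \in I'$, $\forall \alpha \in \Lambda$, we have $I' \notin \Delta(I)$.
Consequently, $I \not\subseteq I'$, i.e., $I \setminus I' \neq \emptyset$.

Conversely, let the given condition hold. Then $\Delta(I) = \bigcup_{\alpha \in \Lambda} C\Delta(a_\alpha)$ is an open
and closed non-trivial subset of $\mathcal{A}$ and hence $(\mathcal{A}, \tau_{\mathcal{A}})$ is disconnected. □

Proposition A.6.4 $(\mathcal{S}, \tau_{\mathcal{S}})$ is a connected space if $E^+ \neq \{0\}$.

Proof Let I be the proper k-ideal generated by $E^+$. Then I belongs to any closed
set $\Delta(I')$ of $\mathcal{S}$, since any full k-ideal of S contains $E^+$. Consequently any two
closed sets of $\mathcal{S}$ are not disjoint. Hence $(\mathcal{S}, \tau_{\mathcal{S}})$ is connected. □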

Example A.6.1 Let $S = N$ be the semiring of non-negative integers with respect
to usual addition and multiplication. Then the prime proper k-ideals of S are $(p) =
\{np : n \in N\}$ where p is a prime number (Sen and Adhikari 1993). Let $\mathcal{A} = \{(p) :
p \text{ is prime}\}$ and $\tau_{\mathcal{A}}$ be the hull-kernel topology defined in $\mathcal{A}$. Since no prime proper
k-ideal of N is contained in another prime proper k-ideal, $(\mathcal{A}, \tau_{\mathcal{A}})$ is a $T_1$-space.
Now let $(p_1)$ and $(p_2)$ be two distinct elements of $\mathcal{A}$ and let $n_1, n_2$ be two elements
of S such that $n_1 \notin (p_1)$ and $n_2 \notin (p_2)$. Then we can always find an element $(p)$ of
$\mathcal{A}$ such that $n_1 \notin (p)$ and $n_2 \notin (p)$. In fact one can take any prime $p > n_1, n_2, p_1, p_2$.
Hence by Theorem A.6.1, $(\mathcal{A}, \tau_{\mathcal{A}})$ is not a Hausdorff space. For each non-zero n, the set
$C\Delta(n)$ is infinite and its complement, which is closed in $\mathcal{A}$, is finite. So any two non-trivial open sets
intersect. Consequently, $(\mathcal{A}, \tau_{\mathcal{A}})$ is neither a $T_2$-space nor a regular space. For this
reason $(\mathcal{A}, \tau_{\mathcal{A}})$ is a connected space. Clearly, $(\mathcal{A}, \tau_{\mathcal{A}})$ is compact by Theorem A.6.3.
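The step "one can take a prime p larger than $n_1, n_2, p_1, p_2$" can be made concrete. The following sketch, with hypothetical function names, finds such a prime by trial division and relies only on the fact that a prime larger than a non-zero n cannot divide n.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def separating_prime(n1, n2, p1, p2):
    """Smallest prime p exceeding n1, n2, p1 and p2.

    For non-zero n1, n2 this p divides neither of them, so the point (p)
    lies in both basic open sets determined by n1 and n2.
    """
    p = max(n1, n2, p1, p2) + 1
    while not is_prime(p):
        p += 1
    return p


def in_basic_open_set(p, n):
    """(p) belongs to the basic open set of n iff n is not a multiple of p."""
    return n % p != 0
```

With $n_1 = 6 \notin (5)$ and $n_2 = 35 \notin (2)$, the function returns 37, and $(37)$ lies in both basic open sets, so these two sets cannot separate $(5)$ from $(2)$.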


A.7 Worked-Out Exercises and Exercises 

Worked-Out Exercises (W.O. Ex) 

1. Let S be an additively inverse semiring in which addition is commutative and let
$E^+$ denote the set of all additive idempotents of S.

(a) Every k-ideal of S is an additively inverse subsemiring of S.

(b) Let A be an ideal of S. Then

(i) $\overline{A} = \{a \in S : a + x \in A \text{ for some } x \in A\}$ is a k-ideal of S such that
$A \subseteq \overline{A}$.

(ii) $\overline{A} = A \Leftrightarrow A$ is a k-ideal.

(c) Let A and B be full k-ideals of S. Then $\overline{A + B}$ is a full k-ideal of S such that
$A \subseteq \overline{A + B}$ and $B \subseteq \overline{A + B}$.

(d) If $I(S)$ denotes the set of full k-ideals of S, then $I(S)$ is a complete lattice
which is also modular.

Solution (a) Let I be a k-ideal of S. Clearly I is a subsemiring of S. Let $a \in I$. Then

$a + (a' + a) = a \in I$.

Since I is a k-ideal, it follows that $a' + a \in I$. Again this implies that $a' \in I$. Hence (a)
follows.

(b) Let $a, b \in \overline{A}$. Then $a + x, b + y \in A$ for some $x, y \in A$. Now $a + x + b + y =
(a + b) + (x + y) \in A$.

As $x + y \in A$, $a + b \in \overline{A}$. Next let $r \in S$. Then $ra + rx = r(a + x) \in A$.

As $rx \in A$, $ra \in \overline{A}$. Similarly, $ar \in \overline{A}$. As a result $\overline{A}$ is an ideal of S. Next, let c
and $c + d \in \overline{A}$. Then there exist x and y in A such that $c + x \in A$ and $c + d + y \in A$.
Now $d + (c + x + y) = (c + d + y) + x \in A$ and $c + x + y \in A$.

Hence $d \in \overline{A}$ and $\overline{A}$ is a k-ideal of S. Since $a + a' \in A$ for all $a \in A$, it follows
that $A \subseteq \overline{A}$.

(c) It can be shown that $A + B$ is an ideal of S. Then from (b), we find $\overline{A + B}$
is a k-ideal and $A + B \subseteq \overline{A + B}$. Now $E^+ \subseteq A, B$. Hence $E^+ \subseteq A + B \subseteq \overline{A + B}$.
This implies that $\overline{A + B}$ is a full k-ideal. Let $a \in A$. Then

$a = a + a' + a = a + (a' + a) \in A + B$ as $a' + a \in B$.

Hence $A \subseteq \overline{A + B}$ and similarly $B \subseteq \overline{A + B}$.

(d) We first note that $I(S)$ is a partially ordered set with respect to usual set
inclusion. Let $A, B \in I(S)$. Then $A \cap B \in I(S)$ and from (c), $\overline{A + B} \in I(S)$. Define
$A \wedge B = A \cap B$ and $A \vee B = \overline{A + B}$. Let $C \in I(S)$ be such that $A, B \subseteq C$. Then
$A + B \subseteq C$ and $\overline{A + B} \subseteq \overline{C}$. But $\overline{C} = C$. Hence $\overline{A + B} \subseteq C$. As a result $\overline{A + B}$
is the lub of A, B. Thus we find that $I(S)$ is a lattice. Now $E^+$ is an ideal of S.
Hence $E^+ \in I(S)$ and also $S \in I(S)$; consequently $I(S)$ is a complete lattice. Next
suppose that $A, B, C \in I(S)$ such that

$A \wedge B = A \wedge C$ and $A \vee B = A \vee C$ and $B \subseteq C$.

Let $x \in C$. Then $x \in A \vee C = A \vee B = \overline{A + B}$. Hence there exists $a + b \in A + B$
such that $x + a + b = a_1 + b_1$ for some $a_1 \in A$, $b_1 \in B$.

Then $x + a + a' + b = a_1 + b_1 + a'$.

Now $x \in C$, $a + a' \in C$ and $b \in B \subseteq C$. Hence $a_1 + b_1 + a' \in C$. But $b_1 \in C$.
Consequently, $a_1 + a' \in C \cap A = A \cap B$. Hence $a_1 + a' \in B$. So from $x + a + b =
a_1 + b_1$ we find that $x + a + a' + b = a_1 + a' + b_1 \in B$. But $(a + a') + b \in B$ and B is a
k-ideal. Hence $x \in B$ and $B = C$. This proves that $I(S)$ is a modular lattice.

Exercises 

1. Define homomorphism, monomorphism, epimorphism, isomorphism for semirings
with the help of the corresponding definitions for semigroups and rings.

2. A mapping $f : (S, +, \cdot) \to (T, +, \cdot)$ is a semiring homomorphism iff both
$f : (S, +) \to (T, +)$ and $f : (S, \cdot) \to (T, \cdot)$ are semigroup homomorphisms.

3. Let $f : (S, +, \cdot) \to (T, +, \cdot)$ be a semiring homomorphism. Then the homomorphic
image $(f(S), +, \cdot)$ of $(S, +, \cdot)$ is also a semiring. If $(S, +, \cdot)$ is commutative,
then $(f(S), +, \cdot)$ is also commutative. If f is an isomorphism then
$f^{-1}$ is also so.

4. Let H be a hemiring, $0 \neq a \in H$, and let $(e_1, e_2) \neq (0, 0)$ be an identity pair
of H. A pair $(c, d) \in H \times H$ is called an inverse of a relative to the identity pair
$(e_1, e_2)$ iff $e_1 + ca = e_2 + da$ and $e_1 + ac = e_2 + ad$. An additively cancellative
hemiring H is called a division hemiring iff H contains an identity pair and
every non-zero element of H is invertible in H. A hemiring H is said to be a
left k-Artinian (left Artinian) iff H satisfies the d.c.c. for k-ideals (ideals) of H.

(a) $H = \{(\ldots) : a, b, d \in R \text{ and } a, d > 5\} \cup \{(\ldots)\}$ is a division hemiring
with $((\ldots), (\ldots))$ as a left identity pair.

(b) A division hemiring H has no zero divisors. 

(c) A division hemiring is multiplicatively cancellative. 

(d) Let H be an additively cancellative hemiring with more than one element. 
Then H is multiplicatively cancellative k-Artinian iff H is a division hemir- 
ing. 

5. A semiring $(S, +, \cdot)$ satisfying $|S| > 2$ is called a semifield iff the set of its
non-zero elements $S'$ forms a multiplicative group.

(a) Let $(S, +, \cdot)$ be a semifield satisfying $|S'| > 2$. Then the identity e of the
group $(S', \cdot)$ is also the identity of $(S, +, \cdot)$ and $(S, +, \cdot)$ is multiplicatively
cancellative.

(b) Let $(S, +, \cdot)$ be a semiring satisfying $|S'| > 2$. Then $(S, +, \cdot)$ is a semifield
iff it satisfies one of the following statements:

(i) $(S, +, \cdot)$ has a left identity e and each $a \in S'$ is invertible in $(S, \cdot)$;

(ii) For arbitrary elements $a \in S'$, $b \in S$, there exist elements $x, y \in S$ such that
$ax = b$ and $ya = b$.

(c) Let $(S, +, \cdot)$ be a semiring satisfying $|S| > 2$. If $(S, +, \cdot)$ has an identity
and each $a \in S'$ has a (left) inverse $a' \in S$, then $(S, +, \cdot)$ is a semifield.

6. A semiring S satisfying the condition (C") of Definition A.2.1 contains at most 
one left ideal consisting of a single element. 

7. Let S = (S, +) be a semimodule and A C S a. maximal ^-closed subsemimodule 
of S. Then A is also ^-closed in S i.e., AcB = B^S^B = Sfor each k- 
closed subsemimodule B of S. 

8. If a semiring S satisfies the condition ( C ') of Definition A.2.1, it contains at 
most two left ^-ideals, which are in fact two-sided ones, namely S and possibly, 
the ideal {0} consisting of the multiplicatively absorbing element 0 of S. 

9. Let S be a semiring. Then each proper k-ideal (h-ideal) of S is contained in a
maximal k-ideal (h-ideal) of S if there exists a finitely generated ideal I of S such that
$I = S$.

10. Give an example of a multiplicatively commutative semiring S which has only
two h-ideals (viz. $Z(S)$ and S), but an infinite chain of k-ideals $B_i$ satisfying

$A = Z(S) \subset B_1 \subset B_2 \subset \cdots \subset S$.

(In particular, $A = Z(S)$ is a maximal h-ideal of S, but not a maximal k-ideal.)


A.8 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003; Adhikari and
Das 1994; Adhikari et al. 1996; Ahsan 1993; Fuchs 1963; Gillman 1957; Glazek
1985; Golan 1992; Hazewinkel 1996; Hebisch and Weinert 1993; Henriksen 1958;
Husemoller 1966; Iizuka 1959; Sen and Adhikari 1992, 1993; Vandiver 1934;
Weinert 1963; Weinert et al. 1996) for further details.


References 

Adhikari, M.R., Adhikari, A.: Groups, Rings and Modules with Applications, 2nd edn. Universities
Press, Hyderabad (2003)

Adhikari, M.R., Das, M.K.: Structure spaces of semirings. Bull. Calcutta Math. Soc. 86, 313-317
(1994)

Adhikari, M.R., Sen, M.K., Weinert, H.J.: On k-regular semirings. Bull. Calcutta Math. Soc. 88,
141-144 (1996)

Ahsan, J.: Fully idempotent semirings. Proc. Jpn. Acad., Ser. A 69, 185-188 (1993)

Fuchs, L.: Partially Ordered Algebraic Systems. Addison-Wesley, Reading (1963)

Gillman, L.: Rings with Hausdorff structure space. Fundam. Math. 45, 1-16 (1957)

Glazek, K.: A Short Guide Through the Literature on Semirings. Math. Inst. Univ. Wroclaw, Poland
(1985)

Golan, J.S.: The Theory of Semirings with Applications in Mathematics and Theoretical Computer
Science. Longman, Harlow (1992)

Hazewinkel, M. (ed.): Handbook of Algebra, vol. I. Elsevier, Amsterdam (1996)

Hebisch, U., Weinert, H.J.: Algebraische Theorie und Anwendungen in der Informatik. Teubner,
Stuttgart (1993)

Henriksen, M.: Ideals in semirings with commutative addition. Not. Am. Math. Soc. 5, 321 (1958)

Husemoller, D.: Fibre Bundles, 2nd edn. Springer, New York (1966)

Iizuka, K.: On the Jacobson radical of a semiring. Tohoku Math. J. 11(2), 409-421 (1959)

Karvellas, P.H.: Inverse semirings. J. Aust. Math. Soc. 18, 277-288 (1974)

Kohls, C.W.: The space of prime ideals of a ring. Fundam. Math. 45, 17-27 (1957)

Sen, M.K., Adhikari, M.R.: On k-ideals of semirings. Int. J. Math. Math. Sci. 15(2), 347-350
(1992)

Sen, M.K., Adhikari, M.R.: On maximal k-ideals of semirings. Proc. Am. Math. Soc. 113(3), 699-703
(1993)

Slowikowski, W., Zawadowski, W.: A generalization of maximal ideals method of Stone and
Gelfand. Fundam. Math. 42, 215-231 (1955)

Vandiver, H.S.: Note on a simple type of algebra in which the cancellation law of addition does not
hold. Bull. Am. Math. Soc. 40, 914-920 (1934)

Weinert, H.J.: Über Halbringe und Halbkörper II. Acta Math. Acad. Sci. Hung. 14, 209-227 (1963)

Weinert, H.J., Sen, M.K., Adhikari, M.R.: One sided k-ideals and h-ideals in semirings. Pannon.
Hungary 7(1), 147-162 (1996)



Appendix B 

Category Theory 


Category theory is a very important branch of modern mathematics. It has been 
growing quite rapidly both in contents and applicability to other branches of the sub- 
ject. The concepts of categories, functors, natural transformations and duality form 
the foundation of category theory. These concepts were introduced during 1942- 
1945 by S. Eilenberg and S. Mac Lane. 1 Originally, the purpose of these notions 
was to provide a technique for classifying certain concepts such as that of natural 
isomorphism. In pedagogical methods in mathematics we compartmentalize math- 
ematics into its different branches without emphasizing their interrelationships. But 
with the help of category theory one can move from one branch of mathematics to 
another. It provides a convenient language to tie together several notions and exist- 
ing results of different branches of mathematics in a unified way. This language is 
conveyed in this chapter through modern algebra, algebraic topology, topological
algebra, sheaf theory etc. 


B.1 Categories

A category may be thought of roughly as consisting of sets, possibly with additional
structures, and functions, possibly preserving additional structures. More precisely,
a category can be defined by the following characteristics.

Definition B.1.1 A category C consists of

(a) a class of objects X, Y, Z, . . . denoted by ob(C);

(b) for each ordered pair of objects X, Y a set of morphisms with domain X and
range Y denoted by $C(X, Y)$ or simply $(X, Y)$; i.e., if $f \in (X, Y)$, then X is
called the domain of f and Y is called the codomain (or range) of f; one also
writes $f : X \to Y$ or $X \xrightarrow{f} Y$ to denote the morphism from X to Y;

(c) for each ordered triple of objects X, Y and Z and a pair of morphisms $f : X \to Y$
and $g : Y \to Z$, their composite denoted by $gf : X \to Z$, i.e., if $f \in (X, Y)$
and $g \in (Y, Z)$, then $gf \in (X, Z)$, satisfying the following two axioms:

(i) associativity: if $f \in (X, Y)$, $g \in (Y, Z)$ and $h \in (Z, W)$, then $h(gf) =
(hg)f \in (X, W)$;

(ii) identity: for each object Y in C, there is a morphism $1_Y \in (Y, Y)$ such that
if $f \in (X, Y)$, then $1_Y f = f$ and if $h \in (Y, Z)$, then $h 1_Y = h$. Clearly, $1_Y$
is unique.

1 (i) Natural isomorphism in group theory. Proc. Natl. Acad. Sci. USA 28, 537-544 (1942)
(ii) General theory of natural equivalence. Trans. Am. Math. Soc. 58, 231-294 (1945)

M.R. Adhikari, A. Adhikari, Basic Modern Algebra with Applications,
DOI 10.1007/978-81-322-1599-8, © Springer India 2014

If the class of objects is a set, the category is said to be small. 

Example B.1.1 (i) Sets and functions form a category denoted by S.

(Here the class of objects is the class of all sets and for sets X and Y, $(X, Y)$
equals the set of functions from X to Y and the composition has the usual meaning,
i.e., usual composition of functions.)

(ii) Sets and injections (or surjections or bijections) form a category.

(iii) Groups and homomorphisms form a category denoted by Grp.

(Here the class of objects is the class of all groups and for groups X and Y, $(X, Y)$
equals the set of homomorphisms from X to Y and the composition has the usual
meaning.)

(iv) Rings and homomorphisms form a category denoted by Ring.

(v) Commutative rings and homomorphisms form a category.

(vi) R-modules and R-homomorphisms form a category denoted by $\mathrm{Mod}_R$.

(vii) Topological spaces and continuous maps form a category denoted by Top.

(viii) Finite sets and functions form a category.

(ix) Given a partially ordered set $(X, \leq)$, there is a category C whose objects are
the elements of X and such that $C(x, x')$ is either the singleton consisting of the
ordered pair $(x, x')$ or empty, according to whether $x \leq x'$ or $x \not\leq x'$, $1_x = (x, x)$
and $(q, r)(x, q) = (x, r)$ when $x \leq q \leq r$.

(Note that $C(x, x')$ is not a set of functions.)

(x) Given a category C, there is an opposite (dual) category $C^0$ whose objects
$Y^0$ are in one-to-one correspondence with the objects Y of C and whose morphisms
$f^0 : Y^0 \to X^0$ are in one-to-one correspondence with the morphisms $f : X \to Y$,
and for $g : Y \to Z$ in C, $f^0 g^0$ is defined by $f^0 g^0 = (gf)^0$.

(xi) If $C_1$ and $C_2$ are categories, their product $C_1 \times C_2$ is the category whose
objects are ordered pairs $(Y_1, Y_2)$ of objects $Y_1$ in $C_1$ and $Y_2$ in $C_2$ and whose
morphisms $(X_1, X_2) \to (Y_1, Y_2)$ are ordered pairs of morphisms $(f_1, f_2)$, where
$f_1 : X_1 \to Y_1$ in $C_1$ and $f_2 : X_2 \to Y_2$ in $C_2$. Similarly, there is a product of an
arbitrary indexed family of categories.

(xii) Let C be a category. We define a category $C^2$ in the following way: objects
of $C^2$ are the morphisms of C; morphisms of $C^2$ are certain pairs of morphisms of
C: for $f \in C(A, B)$ and $g \in C(C, D)$, the pair $(\alpha, \beta)$ is a morphism from f to g in
$C^2$ iff $\alpha : A \to C$ and $\beta : B \to D$ satisfy the commutativity relation $\beta f = g\alpha$.

Note that as objects of $C^2$ and $C \times C$ are different, $C^2 \neq C \times C$.

(xiii) Exact sequences of R-modules and R-homomorphisms form a category.
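Item (ix) of Example B.1.1, in which morphisms are ordered pairs rather than functions, can be tried out on a small poset. The sketch below assumes, purely for illustration, the divisors of 6 ordered by divisibility; the names `hom`, `identity` and `compose` are hypothetical and simply mirror the notation $C(x, x')$, $1_x$ and $(q, r)(x, q) = (x, r)$.

```python
from itertools import product

# Illustrative poset: the divisors of 6, ordered by divisibility.
X = [1, 2, 3, 6]


def leq(x, y):
    """x <= y in the poset means x divides y."""
    return y % x == 0


def hom(x, y):
    """C(x, y): the singleton {(x, y)} if x <= y, otherwise empty."""
    return {(x, y)} if leq(x, y) else set()


def identity(x):
    """The identity morphism 1_x = (x, x)."""
    return (x, x)


def compose(g, f):
    """Composite gf of f = (x, q) followed by g = (q, r), namely (x, r)."""
    if f[1] != g[0]:
        raise ValueError("codomain of f must equal domain of g")
    return (f[0], g[1])


def check_category_axioms():
    """Verify the identity and associativity axioms of Definition B.1.1."""
    for x, y in product(X, X):
        for f in hom(x, y):
            if compose(identity(y), f) != f or compose(f, identity(x)) != f:
                return False
    for x, q, r, t in product(X, repeat=4):
        for f in hom(x, q):
            for g in hom(q, r):
                if compose(g, f) not in hom(x, r):
                    return False
                for h in hom(r, t):
                    if compose(h, compose(g, f)) != compose(compose(h, g), f):
                        return False
    return True
```

Transitivity of the order is what guarantees that a composite lands in a non-empty hom-set, and reflexivity supplies the identities; 2 and 3 are incomparable, so `hom(2, 3)` is empty.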

Definition B.1.2 A subcategory $C' \subseteq C$ is a category such that

(a) the objects of $C'$ are also objects of C, i.e., $ob(C') \subseteq ob(C)$;

(b) for objects $X'$ and $Y'$ of $C'$, $C'(X', Y') \subseteq C(X', Y')$;

(c) if $f' : X' \to Y'$ and $g' : Y' \to Z'$ are morphisms of $C'$, their composite in $C'$
equals their composite in C.

Definition B.1.3 A subcategory $C'$ of C is said to be a full subcategory of C iff for
objects $X'$ and $Y'$ in $C'$, $C(X', Y') = C'(X', Y')$.

The category in Example B.1.1(ii) is a subcategory of the category in
Example B.1.1(i) and the category in Example B.1.1(viii) is a full subcategory of the
category in Example B.1.1(i).

Remark The categories in Examples B.1.1(iii)-(vii) are not subcategories of the
category in Example B.1.1(i), because each object of one of the former categories
consists of a set, endowed with an additional structure (hence, different objects in
these categories may have the same underlying sets).

In the category of Example B.1.1(ix), the morphisms are not functions and so this
category is not a subcategory of the category in Example B.1.1(i).


B.2 Special Morphisms 

Let C be a category and A, B, C, . . . objects of C. 

Definition B.2.1 A morphism $f : A \to B$ in C is called a coretraction iff there is a
morphism $g : B \to A$ in C such that $gf = 1_A$. In this case g is called a left inverse
of f and f is called a right inverse of g and A is called a retract of B.

Dually we say that f is a retraction iff there is a morphism $g' : B \to A$ such that
$fg' = 1_B$ in C. In this case $g'$ is called a right inverse of f.

Definition B.2.2 A two-sided inverse (or simply an inverse) of f is a morphism
which is both a left inverse of f and a right inverse of f.

Lemma B.1 If $f : A \to B$ in C has a left inverse and a right inverse, they are
equal.

Proof Let $g' : B \to A$ be a left inverse of f and $g'' : B \to A$ a right inverse of
f. Then $g'f = 1_A$ and $fg'' = 1_B$. Now $g' = g'1_B = g'(fg'') = (g'f)g'' = 1_A g'' = g''$. □

Definition B.2.3 A morphism $f : A \to B$ is called an equivalence (or an
isomorphism) in a category C, denoted by $f : A \approx B$, iff there is a morphism $g : B \to A$
which is a two-sided inverse of f.

Proposition B.2.1 If $f : A \to B$ is a morphism in C such that f is both a retraction
and a coretraction, then f is an equivalence.

Proof It follows from Definition B.2.1 and Lemma B.1. □

Remark An equivalence $f : A \approx B$ has a unique inverse denoted by $f^{-1} : B \to A$
and $f^{-1}$ is also an equivalence.

Definition B.2.4 Two objects A and B in C are said to be equivalent iff there is an
equivalence $f : A \approx B$ in C.

Remark As the composite of equivalences is an equivalence, the relation of being 
equivalent is an equivalence relation in any set of objects of a category C. 

Definition B.2.5 A morphism a e C(A, B) is called a monomorphism (or monic) 
iff af = ag =>► / = g for all pairs of morphisms /, g with codomain A and same 
domain in C. 

Definition B.2.6 A morphism $\alpha \in \mathcal{C}(A, B)$ is called an epimorphism (or epic) iff
$f\alpha = g\alpha \Rightarrow f = g$ for all pairs of morphisms $f, g$ with domain $B$ and the same
codomain in $\mathcal{C}$.

Remark The notion of an epimorphism is dual to that of a monomorphism in the
sense that $\alpha$ is an epimorphism in $\mathcal{C}$ iff it is a monomorphism in its dual category $\mathcal{C}^{o}$.
A coretraction is necessarily a monomorphism and a retraction is an epimorphism.
Thus an isomorphism is both a monomorphism and an epimorphism in $\mathcal{C}$.

Proposition B.2.2 If $\alpha : A \to B$ is a coretraction and also an epimorphism in $\mathcal{C}$,
then it is an isomorphism in $\mathcal{C}$.

Proof As $\alpha$ is a coretraction, there exists a morphism $\beta : B \to A$ such that $\beta\alpha = 1_A$.
Then

$(\alpha\beta)\alpha = \alpha(\beta\alpha) = \alpha 1_A = \alpha = 1_B\alpha$. (B.1)

Since $\alpha$ is an epimorphism, (B.1) shows that $\alpha\beta = 1_B$. Consequently $\alpha$ is both a
retraction and a coretraction. Hence $\alpha$ is an isomorphism. □
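The notions of left and right inverse can be illustrated in the category of finite sets and functions. The following is a minimal Python sketch, assuming functions are modelled as dictionaries; the sets A and B, the maps f and g, and the helper names compose and identity are hypothetical choices made only for this example. The map f has the left inverse g, so f is a coretraction, but this g is not a right inverse of f because f is not onto.

```python
def compose(g, f):
    """Composite 'g after f' of two functions stored as dictionaries."""
    return {x: g[f[x]] for x in f}

def identity(objects):
    """Identity morphism on a finite set given as a list."""
    return {x: x for x in objects}

A = [0, 1]
B = ["a", "b", "c"]

f = {0: "a", 1: "b"}            # an injective map from A to B
g = {"a": 0, "b": 1, "c": 0}    # a left inverse of f

is_left_inverse = compose(g, f) == identity(A)    # checks g f = 1_A
is_right_inverse = compose(f, g) == identity(B)   # checks f g = 1_B
```

Here is_left_inverse is true while is_right_inverse is false, which is consistent with Proposition B.2.2: f is a coretraction that is not an epimorphism, and it is not an isomorphism.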


B.3 Functors 

Our main interest in categories is in the maps from one category to another. Those 
maps which have the natural properties of preserving identities and composites are
called functors. An algebraic representation of topology is a mapping from topology
to algebra. Such a representation, formally called a functor, converts a topological 
problem into an algebraic one. 

Definition B.3.1 Let $\mathcal{C}$ and $\mathcal{D}$ be categories. A covariant functor (or contravariant
functor) $T$ from $\mathcal{C}$ to $\mathcal{D}$ consists of

(i) an object function which assigns to every object $X$ of $\mathcal{C}$ an object $T(X)$ of $\mathcal{D}$;
and

(ii) a morphism function which assigns to every morphism $f : X \to Y$ in $\mathcal{C}$ a morphism
$T(f) : T(X) \to T(Y)$ (or $T(f) : T(Y) \to T(X)$) in $\mathcal{D}$ such that

(a) $T(1_X) = 1_{T(X)}$;

(b) $T(gf) = T(g)T(f)$ (or $T(gf) = T(f)T(g)$) for $g : Y \to W$ in $\mathcal{C}$.

Example B.3.1 (i) There is a covariant functor from the category of groups and
homomorphisms to the category of sets and functions which assigns to every group
its underlying set. This functor is called a forgetful functor because it forgets the
structure of a group.

(ii) Let $R$ be a commutative ring. Given a fixed $R$-module $M_0$, there is a covariant
functor $\pi_{M_0}$ (or contravariant functor $\pi^{M_0}$) from the category of $R$-modules and
$R$-homomorphisms to itself which assigns to an $R$-module $M$ the $R$-module $Hom_R(M_0, M)$
(or $Hom_R(M, M_0)$), and if $\alpha : M \to N$ is an $R$-module homomorphism,
then $\pi_{M_0}(\alpha) : Hom_R(M_0, M) \to Hom_R(M_0, N)$ is defined by $\pi_{M_0}(\alpha)(f) = \alpha f$
$\forall f \in Hom_R(M_0, M)$ ($\pi^{M_0}(\alpha) : Hom_R(N, M_0) \to Hom_R(M, M_0)$ is defined by
$\pi^{M_0}(\alpha)(f) = f\alpha$ $\forall f \in Hom_R(N, M_0)$).

(iii) Let $\mathcal{C}$ be any category and $C \in ob(\mathcal{C})$. Then there is a covariant functor
$h_C : \mathcal{C} \to \mathcal{S}$ (category of sets and functions) defined by $h_C(A) = \mathcal{C}(C, A)$ (set of all
morphisms from the object $C$ to the object $A$ in $\mathcal{C}$) $\forall$ objects $A \in ob(\mathcal{C})$, and for
$f : A \to B$ in $\mathcal{C}$, $h_C(f) : h_C(A) \to h_C(B)$ is defined by $h_C(f)(g) = fg$ $\forall g \in h_C(A)$
(the right hand side is the composite of morphisms in $\mathcal{C}$).

Its dual functor $h^C$, defined in the usual manner, is a contravariant functor.
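The two functor axioms of Definition B.3.1 can be checked by brute force for the covariant hom-functor on small finite sets. The following is a minimal Python sketch under stated assumptions: a set with n elements is modelled as range(n), a function is stored as the tuple of its images, and the names hom, compose and h, as well as the sizes c, a, b, d, are hypothetical choices for this illustration.

```python
from itertools import product

def hom(m, n):
    """All functions from range(m) to range(n), each stored as a tuple of images."""
    return list(product(range(n), repeat=m))

def compose(g, f):
    """Composite 'g after f' of two functions stored as tuples."""
    return tuple(g[y] for y in f)

def h(c, f):
    """h_C(f): post-composition with f, as a dict on the hom-set out of C."""
    domain_size = len(f)
    return {g: compose(f, g) for g in hom(c, domain_size)}

c, a, b, d = 2, 2, 3, 2   # sizes of the sets C, A, B, D

# Axiom (b): h_C(g f) = h_C(g) h_C(f) for all f from A to B and g from B to D.
laws_hold = True
for f in hom(a, b):
    for g in hom(b, d):
        lhs = h(c, compose(g, f))
        rhs = {k: h(c, g)[h(c, f)[k]] for k in hom(c, a)}
        laws_hold = laws_hold and lhs == rhs

# Axiom (a): h_C(1_A) is the identity on h_C(A).
identity_a = tuple(range(a))
identity_preserved = h(c, identity_a) == {k: k for k in hom(c, a)}
```

Both checks reduce to associativity and the identity law of composition, which is why the hom-functor satisfies the axioms in any category.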

(iv) For any category $\mathcal{C}$ there is a contravariant functor to its opposite category $\mathcal{C}^{o}$
which assigns to an object $X$ of $\mathcal{C}$ the object $X^{o}$ of $\mathcal{C}^{o}$ and to a morphism $f : X \to Y$
in $\mathcal{C}$ the morphism $f^{o} : Y^{o} \to X^{o}$ in $\mathcal{C}^{o}$.

Remark A functor from a category C to itself is sometimes called a functor on C. 
Any contravariant functor on C corresponds to a covariant functor on C° and vice 
versa. Thus any functor can be regarded as a covariant (or contravariant) functor on 
a suitable category. In spite of this, we consider covariant as well as contravariant 
functors on C. 

Definition B.3.2 A functor $T : \mathcal{C} \to \mathcal{D}$ is called

(i) faithful iff the mapping $T : \mathcal{C}(A, B) \to \mathcal{D}(T(A), T(B))$ is injective for every pair of objects $A, B$ of $\mathcal{C}$;

(ii) full iff the mapping $T : \mathcal{C}(A, B) \to \mathcal{D}(T(A), T(B))$ is surjective for every pair of objects $A, B$ of $\mathcal{C}$; and

(iii) an embedding iff $T$ is faithful and $T(A) = T(B) \Rightarrow A = B$.

Definition B.3.3 A category $\mathcal{C}$ is called concrete iff there is a faithful functor
$T : \mathcal{C} \to \mathcal{S}$.


B.4 Natural Transformations 

On some occasions we have to compare functors with each other. We do this by
means of suitable maps between functors.

Definition B.4.1 Let $\mathcal{C}$ and $\mathcal{D}$ be categories. Suppose $T_1$ and $T_2$ are functors of the
same variance (either both covariant or both contravariant) from $\mathcal{C}$ to $\mathcal{D}$. A natural
transformation $\phi$ from $T_1$ to $T_2$ is a function from the objects of $\mathcal{C}$ to morphisms of
$\mathcal{D}$, assigning to each object $X$ a morphism $\phi(X) : T_1(X) \to T_2(X)$, such that for every
morphism $f : X \to Y$ in $\mathcal{C}$ the appropriate one of the following conditions holds:

$\phi(Y)T_1(f) = T_2(f)\phi(X)$ (when $T_1$ and $T_2$ are both covariant functors)
or

$\phi(X)T_1(f) = T_2(f)\phi(Y)$ (when $T_1$ and $T_2$ are both contravariant functors).

Definition B.4.2 Let $\mathcal{C}$ and $\mathcal{D}$ be categories and $T_1$, $T_2$ be functors of the same
variance from $\mathcal{C}$ to $\mathcal{D}$. If $\phi$ is a natural transformation from $T_1$ to $T_2$ such that
$\phi(X)$ is an equivalence in $\mathcal{D}$ for each object $X$ in $\mathcal{C}$, then $\phi$ is called a natural
equivalence.

Example B.4.1 Let $R$ be a commutative ring and Mod be the category of $R$-modules
and $R$-homomorphisms, and let $M$ and $N$ be objects in Mod. Suppose $g : M \to N$
is a morphism in Mod. So by Example B.3.1(ii), $\pi_M$, $\pi_N$ are both covariant functors
and $\pi^M$, $\pi^N$ are both contravariant functors from Mod to itself. Then there exists a
natural transformation $g^* : \pi_N \to \pi_M$, where $g^*(X) : \pi_N(X) \to \pi_M(X)$ is defined
by $g^*(X)(h) = hg$ for every object $X$ in Mod and for all $h \in \pi_N(X)$; and a natural
transformation $g_* : \pi^M \to \pi^N$, where $g_*(X)$ is defined in an analogous manner.

If $g$ is an equivalence in Mod, then both the natural transformations $g^*$ and $g_*$
are natural equivalences.
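The naturality condition of Definition B.4.1 for a transformation of the form $g^*$ can be checked exhaustively in the category of finite sets, which serves here as a stand-in for Mod. The following is a minimal Python sketch under stated assumptions: sets are modelled as range(n), functions as tuples of images, and the names hom, compose, g_star, the sizes m, n, x, y and the particular map g are hypothetical choices for this illustration.

```python
from itertools import product

def hom(m, n):
    """All functions from range(m) to range(n), each stored as a tuple of images."""
    return list(product(range(n), repeat=m))

def compose(g, f):
    """Composite 'g after f' of two functions stored as tuples."""
    return tuple(g[i] for i in f)

m, n, x, y = 2, 3, 2, 2   # sizes of the sets M, N, X, Y
g = (0, 2)                # a fixed map from M to N

def g_star(h):
    """Component of g^*: send h from N to X to the composite h g from M to X."""
    return compose(h, g)

# Naturality: for every f from X to Y and every h in h_N(X),
# g^*(Y)(f h) must equal f (g^*(X)(h)).
natural = all(
    g_star(compose(f, h)) == compose(f, g_star(h))
    for f in hom(x, y)
    for h in hom(n, x)
)
```

The check succeeds because both sides equal the triple composite of f, h and g, so naturality of $g^*$ is an instance of associativity of composition.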

Theorem B.4.1 (Yoneda’s Lemma) Let $\mathcal{C}$ be any category and $T$ a covariant functor
from $\mathcal{C}$ to $\mathcal{S}$ (category of sets and functions). Then for any object $C$ in $\mathcal{C}$, there
is an equivalence $\theta = \theta_{C,T} : (h_C, T) \to T(C)$, where $(h_C, T)$ is the class of natural
transformations from the set valued functor $h_C$ to the set valued functor $T$, such that
$\theta$ is natural in $C$ and $T$.

Fig. B.1 Commutativity of the rectangle for the natural transformation $\eta$: $T(f)\eta(C) = \eta(X)h_C(f)$ as maps $h_C(C) \to T(X)$

Fig. B.2 Commutativity of the rectangle for the natural transformation $\rho(x)$: $T(g)\rho(x)(X) = \rho(x)(Y)h_C(g)$ as maps $h_C(X) \to T(Y)$

Proof We claim that for every object $C$ in $\mathcal{C}$, $(h_C, T)$ is a set. Let $\eta : h_C \to T$ be a
natural transformation.

Then for each $f : C \to X$ in $\mathcal{C}$ the diagram in Fig. B.1 is commutative.

Hence

$T(f)(\eta(C)(1_C)) = (T(f)\eta(C))(1_C) = \eta(X)h_C(f)(1_C) = \eta(X)(f)$ (B.2)

since $h_C(f)(1_C) = f1_C = f$ (see Example B.3.1(iii)).

This shows that for each $X$, the function $\eta(X)$ is completely determined by the
element $\eta(C)(1_C) \in T(C)$. The latter being a set, so is $(h_C, T)$.

Having dealt with this part we proceed with the main part.

We define

$\theta : (h_C, T) \to T(C)$ by $\theta(\eta) = \eta(C)(1_C) \in T(C)$ $\forall \eta \in (h_C, T)$. (B.3)

We now define a function

$\rho : T(C) \to (h_C, T)$ by $\rho(x)(X)(f) = T(f)(x)$ $\forall x \in T(C)$, $X \in \mathcal{C}$ and $f \in \mathcal{C}(C, X)$. (B.4)


We show that $\rho(x) \in (h_C, T)$. From the definition of $\rho$ it follows that $\rho(x)(X) :
\mathcal{C}(C, X) \to T(X)$ is a function in $\mathcal{S}$, because for $f \in \mathcal{C}(C, X)$, $T(f) \in \mathcal{S}(T(C),
T(X)) \Rightarrow T(f)(x) \in T(X)$ $\forall x \in T(C)$.

Now for $g : X \to Y$ in $\mathcal{C}$, the diagram in Fig. B.2 is commutative.

This is because

$\rho(x)(Y)h_C(g)(f) = \rho(x)(Y)(gf) = T(gf)(x) = T(g)(T(f)(x)) = T(g)\rho(x)(X)(f)$ $\forall f \in h_C(X)$.

Consequently, $\rho(x)$ is a natural transformation from $h_C$ to $T$.

Fig. B.3 Commutativity of the rectangle for naturality of $\theta$ in $T$: $\alpha(C)\theta_{C,T} = \theta_{C,S}N_*(\alpha)$ as maps $(h_C, T) \to S(C)$

Fig. B.4 Commutativity of the rectangle for naturality of $\theta$ in $C$: $T(f)\theta_{C,T} = \theta_{D,T}N^*(f)$ as maps $(h_C, T) \to T(D)$

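Now

$(\rho\theta)(\eta) = \rho(\theta(\eta)) = \rho(\eta(C)(1_C))$

$\Rightarrow (\rho\theta)(\eta)(X)(f) = \rho(\eta(C)(1_C))(X)(f) = T(f)(\eta(C)(1_C))$ by (B.4)
$= \eta(X)(f)$ by (B.2) $\forall X \in \mathcal{C}$ and $\forall f \in \mathcal{C}(C, X)$ $\Rightarrow \rho\theta =$ identity. (B.5)

Also, $\forall x \in T(C)$,

$(\theta\rho)(x) = \theta(\rho(x)) = \rho(x)(C)(1_C)$ by (B.3)
$= T(1_C)(x)$ by (B.4) $= 1_{T(C)}(x) = x \Rightarrow \theta\rho =$ identity. (B.6)

Consequently, $\theta$ is an equivalence.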

To show that $\theta$ is natural in $T$, we have to prove that the diagram in Fig. B.3
is commutative, where $\alpha : T \to S$ is any natural transformation from the set valued
functor $T$ to the set valued functor $S$ and $N_*(\alpha) : (h_C, T) \to (h_C, S)$ is
defined by $N_*(\alpha)(\eta) = \alpha\eta$, the latter being given by $(\alpha\eta)(X) = \alpha(X)\eta(X)$ $\forall X \in
\mathcal{C}$. Now $\alpha(C)\theta(\eta) = \alpha(C)\eta(C)(1_C) = (\alpha\eta)(C)(1_C)$ and $\theta N_*(\alpha)(\eta) = \theta(\alpha\eta) =
(\alpha\eta)(C)(1_C)$ $\Rightarrow$ the above diagram commutes $\Rightarrow \theta$ is natural in $T$.

To show that $\theta$ is natural in $C$, we have to prove that the diagram in Fig. B.4 is
commutative for every morphism $f : C \to D$ in $\mathcal{C}$, where for each $X \in \mathcal{C}$ and $\eta \in (h_C, T)$,

$N^*(f)(\eta)(X) : h_D(X) \to T(X)$ is defined by
$N^*(f)(\eta)(X)(g) = \eta(X)(gf)$ $\forall g \in h_D(X)$.

Chasing the element $\eta \in (h_C, T)$ anticlockwise, we have

$\theta_D N^*(f)(\eta) = (N^*(f)(\eta)(D))(1_D) = \eta(D)(f)$ (B.7)

Fig. B.5 Commutativity of the rectangle for naturality of $\eta$: $T(f)\eta(C) = \eta(D)h_C(f)$ as maps $h_C(C) \to T(D)$


and chasing $\eta$ clockwise, we have

$T(f)\theta_C(\eta) = T(f)\eta(C)(1_C)$. (B.8)

Again by the naturality of $\eta$, the diagram in Fig. B.5 is commutative.

Hence

$T(f)\eta(C)(1_C) = \eta(D)(h_C(f)(1_C)) = \eta(D)(f1_C) = \eta(D)(f)$ (B.9)

Hence (B.7)-(B.9) show that $\theta$ is natural in $C$. □
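The bijection of the theorem can be verified by exhaustive search in a very small case. The following is a minimal Python sketch, assuming the category has a single object C whose morphisms form the cyclic group Z/3 under addition, and assuming a functor T given by a hypothetical action of Z/3 on a six-element set; the names M, X, mul, act, is_natural, theta and rho are illustrative and not part of the text. A natural transformation from $h_C$ to $T$ is then a function $\eta$ from M to X with $\eta(mn) = T(m)(\eta(n))$.

```python
from itertools import product

M = [0, 1, 2]            # morphisms from C to C: the group Z/3, identity is 0
X = [0, 1, 2, 3, 4, 5]   # the set T(C)

def mul(m, n):
    """Composition of morphisms in the one-object category."""
    return (m + n) % 3

def act(m, x):
    """T(m) applied to x; an action of Z/3 on X chosen for illustration."""
    return (x + 2 * m) % 6

def is_natural(eta):
    """Naturality of eta (a dict from M to X): eta(m n) = T(m)(eta(n))."""
    return all(eta[mul(m, n)] == act(m, eta[n]) for m in M for n in M)

candidates = [dict(zip(M, values)) for values in product(X, repeat=len(M))]
naturals = [eta for eta in candidates if is_natural(eta)]

theta = [eta[0] for eta in naturals]                 # theta(eta) = eta(C)(1_C), as in (B.3)
rho = {x: {m: act(m, x) for m in M} for x in X}      # rho(x)(f) = T(f)(x), as in (B.4)
```

Out of the 216 candidate functions exactly six are natural, one for each element of T(C), and each of them equals rho(x) for x its value at the identity, in agreement with (B.5) and (B.6).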

Remark For the dual result of the theorem, see Exercise 9. 

Example B.4.2 Let Grp be the category of groups and homomorphisms, $\mathcal{S}$ be the
category of sets and functions and $S : Grp \to \mathcal{S}$ be the forgetful functor which
assigns to each group $G$ its underlying set $SG$. Then

(i) there is a natural equivalence from the covariant functor $h_Z$ to the covariant
functor $S$; and

(ii) there is an equivalence $\theta : (S, S) \to SZ$.

[Hint. (i) Let $G$ be an arbitrary object in Grp. To construct a natural transformation
$\eta : h_Z \to S$, define $\eta(G) : h_Z(G) \to SG$ by

$\eta(G)(f) = f(1)$ $\forall f \in h_Z(G) = Hom(Z, G)$,

where 1 is the generator of the infinite cyclic group $Z$.

Again define $\rho(G) : SG \to h_Z(G)$ by $\rho(G)(x) = f$, where $f$ is the group
homomorphism $f : Z \to G$ defined by $f(1) = x$.

Then, writing $x = f(1)$, $\rho(G)\eta(G)(f) = \rho(G)(f(1)) = \rho(G)(x) = f$ and $\eta(G)\rho(G)(x) =
\eta(G)(f) = f(1) = x$ $\Rightarrow \eta(G)$ is an equivalence $\forall G \in Grp$ $\Rightarrow \eta$ is a natural
equivalence.

(ii) Take in particular $\mathcal{C} = Grp$, $C = Z$, $T = S$ in Yoneda’s Lemma. Then it
follows by (i) and Yoneda’s Lemma that $(S, S) \approx (h_Z, h_Z) \approx h_Z(Z) = Hom(Z, Z) \approx
Z$, where the last equivalence is obtained by assigning to the group homomorphism
$f \in Hom(Z, Z)$ the integer $f(1) \in Z$.]

B.5 Presheaf and Sheaf

Let $X$ be a topological space. A presheaf $A$ (of abelian groups) on $X$ is a contravariant
functor from the category of open subsets of $X$ and inclusions to the category of
abelian groups and homomorphisms. In general, one may define a presheaf with values
in an arbitrary category. Thus, if $A(U)$ is a ring for every open set $U \subset X$
and for each pair of open sets $U, V$ ($U \subset V$),

$\rho_{U,V} : A(V) \to A(U)$ is a ring homomorphism (called restriction) such that
$\rho_{U,U} =$ identity and $\rho_{U,V}\rho_{V,W} = \rho_{U,W}$ when $U \subset V \subset W \subset X$, then $A$ is called
a presheaf of rings.

Similarly, let $A$ be a presheaf of rings on $X$ and suppose that $B$ is a presheaf on
$X$ such that each $B(U)$ is an $A(U)$-module and $\rho_{U,V} : B(V) \to B(U)$ are module
homomorphisms such that $\rho_{U,U} =$ identity and $\rho_{U,V}\rho_{V,W} = \rho_{U,W}$ for open sets
$U \subset V \subset W \subset X$. Then $B$ is said to be a presheaf of $A$-modules.

If $M$ is an abelian group, then there is the ‘constant presheaf’ with $A(U) = M$
for all $U$ and $\rho_{U,V} =$ identity $\forall U \subset V$. We also have the presheaf $B$ assigning to
$U$ the group (under pointwise addition) $B(U)$ of all functions from $U$ to $M$, where
$\rho_{U,V}$ is the canonical restriction. If $M$ is the group of all real numbers, we have the
presheaf $C$ with $C(U)$ being the group of all continuous real valued functions on $U$.
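The presheaf conditions for the presheaf $B$ of all functions into $M$ can be checked mechanically on a small finite space. The following is a minimal Python sketch under stated assumptions: the space has three points with the hypothetical topology listed in opens, $M$ is the group Z/2, a function on $U$ is stored as a frozenset of (point, value) pairs, and the names sections and restrict are illustrative.

```python
from itertools import product

# Open sets of a small topological space on the points 0, 1, 2 (a chain of opens).
opens = [frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})]
M = [0, 1]   # underlying set of the abelian group Z/2

def sections(U):
    """B(U): all functions from U to M, each stored as a frozenset of pairs."""
    points = sorted(U)
    return [frozenset(zip(points, values)) for values in product(M, repeat=len(points))]

def restrict(s, U):
    """Canonical restriction of the function s to the smaller open set U."""
    return frozenset((p, v) for p, v in s if p in U)

# Restricting from W to V and then to U agrees with restricting from W to U.
functorial = all(
    restrict(restrict(s, V), U) == restrict(s, U)
    for U in opens for V in opens for W in opens
    if U.issubset(V) and V.issubset(W)
    for s in sections(W)
)

# Restriction from U to U is the identity, and restrictions land in B(U).
identity_ok = all(restrict(s, U) == s for U in opens for s in sections(U))
lands_in_BU = all(
    restrict(s, U) in sections(U)
    for U in opens for V in opens if U.issubset(V)
    for s in sections(V)
)
```

The sketch checks only the set-level identities $\rho_{U,U} =$ identity and $\rho_{U,V}\rho_{V,W} = \rho_{U,W}$; that each restriction is a group homomorphism follows because addition is pointwise.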
A sheaf (of abelian groups) on a topological space $X$ is a pair $\mathcal{A} = (S, \pi)$, where

(i) $S$ is a topological space;

(ii) $\pi : S \to X$ is a local homeomorphism, i.e., every point $\alpha \in S$ has an open
neighborhood $N$ in $S$ such that $\pi|_N$ is a homeomorphism between $N$ and an
open neighborhood of $\pi(\alpha)$ in $X$;

(iii) each $\mathcal{A}_x = \pi^{-1}(x)$, for $x \in X$, is an abelian group (and is called the stalk of $S$
over $x$);

(iv) the group operations are continuous.

The meaning of (iv) is as follows: Let $S \oplus S = \{(\alpha, \beta) \in S \times S : \pi(\alpha) = \pi(\beta)\}$ be
the subset of $S \times S$ with the induced topology. Then the map $S \oplus S \to S$ defined by
$(\alpha, \beta) \mapsto \alpha - \beta$ is continuous (equivalently, the map $S \to S$, $\alpha \mapsto -\alpha$ is continuous and
the map $S \oplus S \to S$, $(\alpha, \beta) \mapsto \alpha + \beta$ is continuous).

Similarly, we may, for example, define a sheaf of rings or a module (sheaf of
modules) over a sheaf of rings. Thus for a sheaf of rings, each stalk is assumed to
have the (given) structure of a ring and the map $S \oplus S \to S$, $(\alpha, \beta) \mapsto \alpha\beta$ is assumed
to be continuous (in addition to (iv)). For a sheaf of $R$-modules each stalk of $S$ is an
$R$-module and the module multiplication $S \to S$ defined by $\alpha \mapsto r\alpha$ is continuous
for each $r \in R$.

The Canonical Presheaf of a Sheaf Let $\mathcal{A} = (S, \pi)$ be a sheaf on $X$. A section
of $\mathcal{A}$ over an open set $U \subset X$ is a continuous map $s : U \to S$ such that $\pi s : U \to U$
is the identity on $U$.

By (iii) the set of all sections of $\mathcal{A}$ over $U$ is an abelian group denoted by $\mathcal{A}(U)$.
Similarly, if $\mathcal{R}$ is a sheaf of rings, $\mathcal{R}(U)$ is a ring.

Fig. B.6 Commutativity of the rectangle for a homomorphism of presheaves: $\tau_{U,V}h(U) = h(V)\rho_{U,V}$ as maps $A(U) \to B(V)$


Now we assign to each open set $U$ of $X$ the group $\mathcal{A}(U)$ of sections of the sheaf
$\mathcal{A}$ over $U$, where $\mathcal{A}(U)$ is understood to be the zero group if $U = \emptyset$. If $V \subset U$,
define

$\rho_{UV} : \mathcal{A}(U) \to \mathcal{A}(V)$ (B.10)

to be the homomorphism which assigns to each section of $\mathcal{A}$ over $U$ its restriction
to $V$ (if $V = \emptyset$, we put $\rho_{UV} = 0$). The assignment

$U \mapsto \mathcal{A}(U)$ for open sets $U \subset X$ defines a presheaf $\{\mathcal{A}(U), \rho_{UV}\}$ on $X$. This
presheaf is called the canonical presheaf of the sheaf $\mathcal{A} = (S, \pi)$ or the presheaf
of sections of $\mathcal{A}$.

The Sheaf Generated by a Presheaf Let $A$ be a presheaf on $X$. For each open
set $U \subset X$, consider the space $U \times A(U)$, where $U$ has the subspace topology and
$A(U)$ has the discrete topology. Form the disjoint union $E = \bigcup_{U \subset X}(U \times A(U))$.
We consider the following equivalence relation $\rho$ on $E$.

If $(x, s) \in U \times A(U)$ and $(y, t) \in V \times A(V)$, then $(x, s)\rho(y, t) \Leftrightarrow$ ($x =
y$ and $\exists$ an open neighborhood $W$ of $x$ with $W \subset U \cap V$ and $\rho_{U,W}(s) = \rho_{V,W}(t)$).
Let $\mathcal{A}$ be the quotient space $E/\rho$ and $\pi : \mathcal{A} \to X$ be the projection induced by
the map $p : E \to X$, $(x, s) \mapsto x$. Then $\pi$ is a local homeomorphism. Clearly,
$\mathcal{A}_x = \pi^{-1}(x)$ is the direct limit of $A(U)$ for $U$ ranging over the open neighborhoods
of $x$. Thus the stalk $\mathcal{A}_x$ has a natural group structure. Clearly, the group
operations in $\mathcal{A}$ are continuous (since they are in $E$). Thus $\mathcal{A}$ is a sheaf called the
sheaf generated by the presheaf $A$.

Homomorphisms of Presheaves and Sheaves We mainly consider presheaves
or sheaves over a fixed base space $X$. A homomorphism $h : A \to B$ of presheaves
is a collection of homomorphisms $h_U : A(U) \to B(U)$ commuting with the restrictions,
i.e., making the diagram in Fig. B.6 commutative for $V \subset U \subset X$. The restrictions $\rho_{UV}$ and $\tau_{UV}$
in Fig. B.6 are defined as in (B.10). Then $h$ is a natural transformation of functors.

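A homomorphism $f : \mathcal{A} \to \mathcal{A}'$ of sheaves $\mathcal{A} = (Y, \pi)$ and $\mathcal{A}' = (Y', \pi')$ on $X$
is a continuous map $f : Y \to Y'$ such that $f(\mathcal{A}_x) \subset \mathcal{A}'_x$ $\forall x \in X$ and the restriction
$f_x : \mathcal{A}_x \to \mathcal{A}'_x$ of $f$ to stalks is a homomorphism $\forall x \in X$.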

A homomorphism of sheaves induces a homomorphism of their corresponding 
canonical presheaves. 

Conversely, let $h : A \to B$ be a homomorphism of presheaves. For each $x \in
X$, $h$ induces a homomorphism $h_x : \mathcal{A}_x = \varinjlim_{x \in U} A(U) \to \varinjlim_{x \in U} B(U) = \mathcal{B}_x$,
and therefore a map $h : \mathcal{A} \to \mathcal{B}$. If $s \in A(U)$, then $h$ maps the section $\theta(s) \in \mathcal{A}(U)$
onto the section $\theta(h(s)) \in \mathcal{B}(U)$.

B.6 Exercises

1. If $\alpha : A \to B$ is a retraction and also a monomorphism in a category $\mathcal{C}$, prove
that $\alpha$ is an isomorphism (dual result of Proposition B.2.2).

[Hint. $\alpha$ is a retraction $\Rightarrow \exists$ a morphism $\beta : B \to A$ such that $\alpha\beta = 1_B$.
Again $\alpha(\beta\alpha) = (\alpha\beta)\alpha = 1_B\alpha = \alpha 1_A \Rightarrow \beta\alpha = 1_A$ (as $\alpha$ is a monomorphism).]

2. Show that if $\alpha : A \to B$ is an epimorphism in $\mathcal{S}$, then $\alpha^{o} \in \mathcal{S}^{o}$ is a
monomorphism.

3. (a) Let $T$ be a functor from a category $\mathcal{C}$ to a category $\mathcal{D}$. Show that $T$ maps
equivalences in $\mathcal{C}$ to equivalences in $\mathcal{D}$.

(b) Show that the equivalences in the category

(i) $\mathcal{S}$ are bijections;

(ii) Grp are isomorphisms of groups;

(iii) Ring are isomorphisms of rings;

(iv) $Mod_R$ are isomorphisms of modules;

(v) Top are homeomorphisms of topological spaces.

4. Let $A$ be a subspace of a topological space $X$ and $f : A \to Y$ continuous. Then
$f$ is said to have a continuous extension $F : X \to Y$ iff $F \circ i = f$, where $i :
A \to X$ is the inclusion map.

Let $T$ be a covariant (or contravariant) functor from the category of topological
spaces and continuous maps to a category $\mathcal{S}$. Show that a necessary
condition that a map $f : A \to Y$ be extendable to $X$ is that $\exists$ a morphism

$\phi : T(X) \to T(Y)$ (or $\phi : T(Y) \to T(X)$)

in $\mathcal{S}$ such that

$\phi \circ T(i) = T(f)$ (or $T(f) = T(i) \circ \phi$).

5. Let $\mathbf{R}^n$ be the Euclidean $n$-space, with $\|x\| = \sqrt{x_1^2 + \cdots + x_n^2}$, $E^n = n$-ball $= \{x \in \mathbf{R}^n :
\|x\| \le 1\}$ and $S^{n-1} = (n-1)$-sphere $= \{x \in \mathbf{R}^n : \|x\| = 1\}$.

Prove the following:

(a) the identity map $1_{S^n} : S^n \to S^n$ cannot be extended to a continuous map
$E^{n+1} \to S^n$;

(b) Brouwer Fixed Point Theorem Any continuous map $f : E^{n+1} \to E^{n+1}$
has a fixed point (i.e., $f(x) = x$ for some $x \in E^{n+1}$).

[Hint. (a) If possible, let $f : E^{n+1} \to S^n$ be a continuous extension of
$1_{S^n}$. Then $f \circ i = 1_{S^n}$ (see Exercise 4). Assume that $H_n$ is the homology
functor such that $H_n(S^n) = Z$ and $H_n(E^{n+1}) = 0$.

Apply $H_n$ to $f \circ i = 1_{S^n}$ and obtain $H_n(S^n) \to H_n(E^{n+1}) \to H_n(S^n)$
such that the composite homomorphism is the identity homomorphism of
$H_n(S^n)$, which is not possible as the composite homomorphism $Z \to 0 \to
Z$ cannot be the identity. For $n = 1$, the result (a) can be proved in a similar
way by using the result that $\pi_1(S^1) = Z$ and $\pi_1(E^2) = 0$ (see Sect. 2.10 of
Chap. 2).

(b) Let $f : E^{n+1} \to E^{n+1}$ be a continuous map. If possible, suppose $f$ has no
fixed point (i.e., $f(x) \ne x$ for any $x \in E^{n+1}$). Then for any $x \in E^{n+1}$ join
$f(x)$ to $x$ by a line and move along the line in the direction from $f(x)$
to $x$ until the unique point $r(x) \in S^n$ is reached. Then $r : E^{n+1} \to S^n$ is
a continuous map extending $1_{S^n}$, which contradicts the result of (a). The
Brouwer fixed point theorem for dimension 2 as stated in (b) can be proved
in a similar way by using the result that $\pi_1(S^1) = Z$ and $\pi_1(E^2) = 0$ (see
Sect. 2.10 of Chap. 2).]

6. Let $\mathcal{S}$ be a category and $(X, Y)$ be the set of morphisms from $X$ to $Y$ in $\mathcal{S}$.
By keeping $X$ fixed and varying $Y$, show that this set is an invariant of the
equivalent sets in the sense that there is a bijective correspondence between the
sets corresponding to the equivalent sets. Find the corresponding result when $X$
is varied and $Y$ is kept fixed.

[Hint. Let $Y \approx Z$. So there exist $f : Y \to Z$ and $g : Z \to Y$ such that $g \circ f =
1_Y$ and $f \circ g = 1_Z$. Then $f_* : (X, Y) \to (X, Z)$ defined by $f_*(\alpha) = f \circ \alpha$ and
$g_* : (X, Z) \to (X, Y)$ defined by $g_*(\beta) = g \circ \beta$ are such that $(g \circ f)_* = g_* \circ
f_* =$ identity and $f_* \circ g_* =$ identity. Consequently $f_*$ is a bijection.]

7. Let $X$ and $Y$ be objects of a category $\mathcal{S}$ and let $g : X \to Y$ be a morphism in
$\mathcal{S}$. Show that there is a natural transformation $g^*$ from the covariant functor
$h_Y$ to the covariant functor $h_X$ and a natural transformation $g_*$ from the
contravariant functor $h^X$ to the contravariant functor $h^Y$. Further show that if $g$ is
an equivalence in $\mathcal{S}$, both these natural transformations $g^*$ and $g_*$ are natural
equivalences.

8. Let $A$ and $C$ be objects of a category $\mathcal{C}$. Using the Yoneda Lemma, show that
$(h_C, h_A) \approx \mathcal{C}(A, C)$.

[Hint. Take $T = h_A$. Then by Yoneda’s Lemma $(h_C, h_A) \approx h_A(C) =
\mathcal{C}(A, C)$.]

9. Prove the dual result of Theorem B.4.1:

Let $\mathcal{C}$ be any category and $T$ a contravariant functor from $\mathcal{C}$ to $\mathcal{S}$. Then for
any object $C \in \mathcal{C}$, there is an equivalence $\theta : (h^C, T) \to T(C)$ such that $\theta$ is
natural in $C$ and $T$.

10. Let Htp denote the category of pointed topological spaces and homotopy classes
of their base point preserving continuous maps and Grp be the category of
groups and their homomorphisms.

(a) If $P$ is an $H$-group, show that there exists a contravariant functor $\pi^P$ from
Htp to Grp. Further show that a homomorphism $\alpha : P \to P'$ between
$H$-groups induces a natural transformation $\alpha_* : \pi^P \to \pi^{P'}$.

(b) If $P$ is a topological group, show that $\pi^P$ is a contravariant functor from
Htp to Grp.

(c) $\pi_1$ is a covariant functor from Htp to Grp.

Proof (a) Using Theorems 2.9.1-2.9.3, Chap. 2, define $\pi^P : Htp \to Grp$ such
that $\pi^P(X) = [X; P]$ and, for $f : X \to Y$ in Htp, $\pi^P(f) = f^* : [Y; P] \to
[X; P]$ by $f^*[h] = [h \circ f]$. Then $\pi^P$ is a contravariant functor.

Again, for $[g] \in [Y; P]$, $f^*\alpha_*[g] = f^*[\alpha \circ g] = [(\alpha \circ g) \circ f]$ and $\alpha_*f^*[g] =
\alpha_*[g \circ f] = [\alpha \circ (g \circ f)]$ $\Rightarrow f^* \circ \alpha_* = \alpha_* \circ f^*$. Consequently, $\alpha_*$ is a natural
transformation.

(b) If $g_1, g_2 : X \to P$ are base point preserving continuous maps, then $g_1g_2 :
X \to P$ is defined by $(g_1g_2)(x) = g_1(x)g_2(x)$ $\forall x \in X$, where the right hand
side is the group product in $P$. The law of composition carries over to give an
operation on homotopy classes such that $[g_1][g_2] = [g_1g_2]$. Then $[X; P]$ is a
group. Hence (b) follows from (a).

(c) It follows from Sect. 2.10 of Chap. 2. □

11. Let $C(X)$ be the ring of real valued continuous functions on a topological space
$X$. Then $C$ is a contravariant functor from the category Top of topological
spaces and continuous functions to the category Ring of rings and homomorphisms.

12. Let Spec (R) be the spectrum space of a ring R endowed with Zariski topology 
(see Sect. 9.12). Then Spec is a contravariant functor from Ring to Top. 

13. (a) Show that all chain complexes and chain maps (see Definition 9.11.4) form
a category. We denote this category by Comp.

(b) For each $n \in Z$, show that $H_n : Comp \to Mod$ is a covariant functor (see
Definition 9.11.3).

(c) For each $n \in Z$, show that $H^n : Comp \to Mod$ is a contravariant functor
(see Definition 9.11.9).

[Hint. For (a) and (b) use Proposition 9.11.4.]

14. Let Ab denote the category of abelian groups and their homomorphisms.

(a) For an abelian group $G$, let $T(G)$ denote its torsion group.

Show that $T : Ab \to Ab$ defines a functor if $T(f)$ is defined by $T(f) =
f|_{T(G)}$ for every homomorphism $f : G \to H$ in Ab such that

(i) $f$ is a monomorphism in Ab implies that $T(f)$ is also so;

(ii) $f$ is an epimorphism in Ab does not always imply that $T(f)$ is also so.

(b) Let $p$ be a fixed prime integer. Show that $T : Ab \to Ab$ defines a functor,
where the object function is defined by $T(G) = G/pG$ and the morphism
function $T(f)$ is defined by $T(f) : G/pG \to H/pH$, $x + pG \mapsto f(x) +
pH$ for every homomorphism $f : G \to H$ in Ab such that

(i) $f$ is an epimorphism in Ab implies that $T(f)$ is an epimorphism;

(ii) $f$ is a monomorphism in Ab does not always imply that $T(f)$ is a
monomorphism.


B.7 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003; Mac Lane 1997;
Mitchell 1965; Spanier 1966) for further details.

Appendix C

A Brief Historical Note 


Modern algebra began with the work of the French mathematician E. Galois (1811-
1832) who died in a duel at a very young age. He created one of the most important
theories in the history of algebra known as “The Theory of Galois”. Before Galois, 
algebraists had mainly concentrated on the solutions of algebraic equations. Scip- 
ione dal Ferro, Tartaglia, and Cardano solved cubic equations, and Ferrari solved 
biquadratic equations completely by radicals. The most important predecessors of 
Galois are J.L. Lagrange (1736-1813), C.F. Gauss (1777-1855) and N.H. Abel 
(1802-1829). 

Linear algebra arose through the study of the theory of solutions of linear equa- 
tions and analytic geometry. The former contributed the algebraic formulas and 
the latter geometric images. The concept of vector spaces (linear spaces) became 
known around 1920. Hermann Weyl gave a formal definition. A study of matrices, 
determinants and their closely related topics is the main objective of linear alge- 
bra. Theories of determinants, quadratic forms, linear algebraic equations and linear 
differential equations were developed in the first third of the 19th century mainly 
by J.L. Lagrange and C.F. Gauss. In 1801, in the fifth part of ‘Disquisitiones
Arithmeticae’, Gauss studied the problem of reduction of quadratic forms with integral
coefficients to canonical forms by using invertible substitutions of the variables with
integral coefficients. A.L. Cauchy, in his paper “Memoir on Functions”, considered
functions taking just two values, equal in magnitude but opposite in sign, under the
permutations of the variables contained in them, and extended the investigation of
C.A. Vandermonde for determinants of small order. He viewed a determinant as a function
of $n^2$ variables which he arranged in a square table. Some years later, C.G.J. Jacobi
published a series of papers on the theory of determinants and quadratic forms. The
existing notation denoting a determinant by means of two vertical bars was introduced by
Arthur Cayley, a British mathematician. His work on analytical geometry of dimension n and
that of H. Grassmann in “The Science of Linear Extension” mark a turning point
in the evolution of linear algebra. An important role in the study of linear algebra was
played by B. Riemann (1826-1866) in his famous paper “On the Hypotheses that
lie at the Foundation of Geometry” and by F. Klein in his work on “n-Dimensional
Space V and the Related Geometric Notion of Algebra” in 1870. On the other hand,
Arthur Cayley introduced the concept of matrices in 1855 and published several
notes in Crelle’s Journal. The theory of determinants is much older than the theory 
of matrices. The German Mathematician Leibniz (1646-1716) introduced the con- 
cept of determinants in connection with the solvability of systems of linear equations 
communicated in a letter to L’ Hospital dated April 28, 1693, but the term “determi- 
nant” was coined by C.F. Gauss in 1801. Cayley is the first mathematician to realize 
the importance of theory of matrices and laid the foundation of this theory. The 
progress in linear algebra during the last few decades has focused on several techniques
like rank-factorization, generalized inverses and singular value decomposition. Vec- 
tor spaces over finite fields play a very important role in computer science, coding 
theory, design of experiments, combinatorics etc. Vector spaces over the rational 
field Q are very important in number theory and design of experiments and linear 
spaces over the complex field C are essential for the study of eigenvalues. The the- 
ory of linear spaces over an arbitrary field F is mainly developed with an eye to the 
model of linear spaces over R. 

Leonhard Euler (1707-1783) He was born in Basel, Switzerland. He spent most 
of his time in St. Petersburg and Berlin. He joined the St. Petersburg Academy 
of Sciences in 1727. He went to Berlin in 1741. He returned to St. Petersburg 
in 1766, where he remained until his death. He is one of the greatest mathematicians
of the world. He published 886 papers and books. He is the inventor of the
Euler “φ-function”. His other important contributions are “Euler’s formula”,
“Euler’s theorem”, “Eulerian angle”, “Eulerian number”, “Euler-Lagrange equation”
etc. He represented the Königsberg bridge problem (concerning the seven bridges
in Königsberg crossing the river Pregel) by a graph in which the areas are points and
the bridges are edges. This study of the seven bridges of Königsberg is the beginning
of combinatorial topology. His fundamental work in different areas makes his
presence felt everywhere in mathematics. He died in 1783.

Joseph Louis Lagrange (1736-1813) He was born on January 25, 1736 in Turin, 
Italy. His mathematical contribution to different branches of mathematics including 
number theory, theory of equations, differential equations, celestial mechanics, and 
fluid mechanics is of fundamental importance. In 1771, he presented an extremely 
valuable memoir to the Berlin Academy ‘Reflections sur la theorie algebrique des 
equations’. In this paper he tried to prescribe a general method of solution for poly- 
nomials of degree greater than 4. He, however, failed to achieve his goal. But some 
new concepts introduced in this paper on permutations of roots stimulated his suc- 
cessors, like Abel and Galois to develop the necessary theory to find a general 
method of solution. This paper is considered to be one of the main sources from 
which modern group theory developed. In 1770, he proved his famous Lagrange’s 
theorem in group theory. His presence is felt everywhere in mathematics. He died 
on April 10, 1813. 

Carl Friedrich Gauss (1777-1855) He was born on April 30, 1777 in a poor 
family, in Brunswick, Germany. At the age of 20, he published the first proof of 
the fundamental theorem of algebra. Lagrange considered special equations, such
as the cyclotomic equation $x^n - 1 = 0$. But he did not go very far. The complete
solution of this equation by means of radicals was given by Gauss in 1801 in his
book on number theory ‘Disquisitiones Arithmeticae’, which laid the foundation
of algebraic number theory. A second edition of his masterpiece was published in
1870 (15 years after his death). Gauss introduced the concept of congruence class
$\mathbf{Z}_n$ of integers modulo $n$, the notation $i$ for $\sqrt{-1}$, the term complex numbers and
Gaussian integers $\mathbf{Z}[i]$, and studied these extensively. He was the first mathematician to
study the structures of fields and groups and established their close relationship.
He referred to mathematics as ‘Queen of Sciences’. Gauss did not have interest 
in teaching. He preferred his job as the Director of Observatory at Gottingen. But 
he accepted students like Dedekind, Dirichlet, Riemann, Eisenstein and Kummer. 
E.T. Bell remarked “Gauss lives everywhere in mathematics”. He coined the term 
‘determinant’ in 1801. He died on February 23, 1855. 

Augustin Louis Cauchy (1789-1857) He was born on August 21, 1789, in Paris, 
France. He came in touch with his neighbors Laplace and Berthollet in his childhood.
He became an engineer in 1810 and started his mathematical research in 1811 with
a problem from Lagrange on convex polyhedrons. He loved teaching. He solved the
long-standing problem of Fermat on polygonal numbers in 1812. He published more
than 800 papers and 8 books on mathematics, mathematical physics and celestial 
mechanics. His work on mathematics covered calculus, complex functions, alge- 
bra, differential equations, geometry and analysis. The notion of continuity is his 
contribution. His treatise on the definite integral submitted to the French Academy 
in 1814 forms a basis of the theory of complex functions. He made a distinction 
between permutations and substitutions. An arrangement of the n variables written in any
order was called a permutation, while a passage from one permutation to another, written in
2-row notation and called a substitution, was introduced by him. We now call a substitution
a permutation. He introduced the concept of ‘group of substitutions’ which was later 
called ‘group of permutations’. Cauchy published a sequence of papers on substi- 
tutions during 1844-1846. The concepts of order of an element, a subgroup, and 
conjugates are found in his papers. He proved a theorem on groups of finite order, 
now called Cauchy’s Theorem in his honor. His work on determinants and matrices 
is also important. Cauchy extended the investigation of C.A. Vandermonde for 
determinants of small order. He viewed a determinant as a function of n^2 variables 
which he arranged in a square table. He died on May 22, 1857. 

Niels Henrik Abel (1802-1829) He was born on August 5, 1802, in Finnoy, Norway. 
While he was a school student, his mathematics teacher Holmboe encouraged him 
to read the work of Euler, Lagrange, Laplace, and Cauchy. Inspired 
by their work, he began to attack the then unsolved problem of solvability of 
the quintic equation. After a long effort, he proved in 1824 that a solution of this 
problem by radicals is impossible. He then published a leaflet in French entitled 
‘Memoire sur les equations algebriques’ in 1824. While proving this impossibility, 
he used some results obtained by Lagrange and Cauchy. The groups in which 
composition is commutative are now called Abelian in his honour. His work on el- 
liptic functions revolutionized the theory of elliptic functions. He moved to Paris 
and Berlin in search of a teaching assignment. News of his appointment 
as a Professor of Mathematics at the University of Berlin reached his home 2 days 
after his death from tuberculosis on April 6, 1829. The Abel Prize (regarded as an 
equivalent of the Nobel Prize) was instituted in 2002 by the Norwegian Government 
to mark the 200th anniversary of his birth. 

C.G.J. Jacobi (1804-1851) He published a number of papers on the theory of determinants 
and quadratic forms. The identity introduced by him, called “The Jacobi 
Identity”, is the relationship [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 between three 
elements A, B, and C, where [A, B] is the commutator. The elements of a Lie algebra 
satisfy this identity. Jacobi’s Identity has wide applications in science and 
engineering. 
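
As an illustration of the identity above, the following is a minimal sketch in Python that checks it for square matrices, taking the commutator to be [A, B] = AB − BA. The helper names (mat_mul, commutator, jacobi_sum) and the sample matrices A, B, C are chosen here for illustration only and do not come from the text.

```python
# Minimal sketch: check the Jacobi identity for the matrix commutator
# [X, Y] = XY - YX, using plain lists of integers (no external libraries).
# All names and sample matrices below are illustrative assumptions.

def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def mat_sub(x, y):
    """Entrywise difference of two matrices."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def commutator(x, y):
    """The commutator [X, Y] = XY - YX."""
    return mat_sub(mat_mul(x, y), mat_mul(y, x))

def jacobi_sum(a, b, c):
    """Left-hand side [A,[B,C]] + [B,[C,A]] + [C,[A,B]] of the identity."""
    first = commutator(a, commutator(b, c))
    second = commutator(b, commutator(c, a))
    third = commutator(c, commutator(a, b))
    return mat_add(mat_add(first, second), third)

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [5, -1]]

print(jacobi_sum(A, B, C))  # the zero matrix: [[0, 0], [0, 0]]
```

The check succeeds for any choice of square matrices of equal size, because expanding the three nested commutators produces twelve triple products that cancel in pairs; only associativity of matrix multiplication is used.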

H. Grassmann (1809-1877) His work on analytical geometry for dimension n 
appeared in “The Science of Linear Extension” and marked a turning point in the 
evolution of linear algebra. 

Evariste Galois (1811-1832) He was born on October 25, 1811, in Bourg-la-Reine, 
near Paris, France. Galois twice failed the entrance examination for the Ecole 
Polytechnique. Galois presented his first paper on the solution of algebraic equations 
to the Academie des Sciences de Paris in May, 1829. He also presented his 
second paper on equations of prime degree to the Academy. Both the papers were 
sent to Cauchy but were lost forever. He presented another paper on the solution of 
algebraic equations to the Academy in February, 1830, which was sent to Fourier 
by the Academy. This paper was also lost forever. In April 1830, Galois published in 
Ferussac’s “Bulletin des Sciences Mathematiques” an article in which he 
announced some of the main results of the lost papers. One theorem of the paper 
states “In order that an equation of prime degree is solvable by radicals, it is nec- 
essary and sufficient that, if two of its roots are known, the others can be expressed 
rationally. This shows that the general equation of degree 5 cannot be solved by 
radicals.” He published two more papers in June, 1830, on the resolution of numerical 
equations and on the structure of finite fields. Galois presented to the Academy 
a new version of his memoir entitled “Memoire sur les conditions de resolubilite 
des equations par radicaux” in January, 1831. This paper was published in 1846 (14 
years after the death of Galois) by Liouville in the “Journal de mathematiques pures 
et appliquees”, Vol. 11. An edition containing all preserved letters and manuscripts of 
Galois was published by Gauthier-Villars in 1962 under the title “Ecrits et memoires 
mathematiques d’Evariste Galois”. 

Arthur Cayley (1821-1895) He was born on August 16, 1821, in Richmond, Surrey, 
England. He became a lawyer in 1849. During his practice in the legal profession 
up to 1863, he wrote about 300 mathematical papers. He joined Cambridge as Professor of 
Pure Mathematics in 1863 and worked there until his death on January 26, 1895. 
He worked on mathematics, theoretical dynamics and mathematical 
astronomy. He wrote 966 papers and one book. In 1854, Cayley published two papers 
under the title “On the Theory of Groups Depending on the Symbolic Equation 
θ^n = 1” in Philosophical Magazine of Royal Society, London, Vol. 7. In 1878, he 
gave the abstract definition of groups, formulated the problem of finding all finite 
groups of a given order n, and proved the famous Cayley Theorem for any finite 
group. He introduced the multiplication table for a finite group, known as the Cayley 
Table. He introduced the concept of matrices in 1855 and published several notes in 
Crelle’s Journal. He was the first mathematician who realized the importance of the theory 
of matrices and laid the foundation of this theory. He developed matrix theory 
and proved the Cayley-Hamilton theorem. He was one of the earliest mathematicians 
who studied geometry of dimensions greater than 3. The existing notation denoting 
a determinant by means of two vertical bars was introduced by A. Cayley. He is 
considered the founder of abstract group theory. 

Leopold Kronecker (1823-1891) He was born on December 7, 1823, in Germany. 
After 1870, the abstract notion of “groups” was developed in several stages, 
essentially due to Kronecker (1870) and Cayley (1878). The modern definition of 
a group by axioms was given for abelian groups by Kronecker in 1870. After the 
introduction of the concept of abstract group by Cayley and Kronecker, the theory 
of groups changed its character. Before them, the main problem was to determine 
the structure of permutation groups under certain conditions as well as to determine 
the structure of finite dimensional continuous groups of transformations. But after- 
wards, the main problem became: to create a general theory of the structure of ab- 
stract groups, and to determine all finite groups of a given order. E.E. Kummer was 
his mathematics teacher. Kronecker was greatly inspired by Kummer for research. 
Kronecker’s work on algebraic number theory placed him as one of the inventors of 
algebraic number theory along with Kummer and Dedekind. Kronecker is considered 
to be the first mathematician who clearly understood the work of Galois. While 
Weierstrass and Cantor were creating modern analysis, Kronecker remarked “God 
made positive integers, all else is due to man”. This remark badly affected Cantor. 
Kronecker died on December 29, 1891. 

Bernhard Riemann (1826-1866) His famous paper “On the Hypotheses that Lie 
at the Foundation of Geometry” plays an important role in studying linear algebra. 
His idea on the geometry of space has had a significant effect on the development of 
modern theoretical physics. He clarified the notion of integral by defining an integral 
now called the Riemann Integral. 

Richard Dedekind (1831-1916) He was born on October 6, 1831, in 
Brunswick, Germany. He was in contact with Gauss, Dirichlet, and Riemann. He completed 
his Ph.D. work under Gauss. He attended a series of lectures of Dirichlet on 
the theory of numbers and lectures of Riemann on Abelian and elliptic functions. He 
became interested in analytic geometry and algebraic analysis, differential and integral 
calculus and also in mechanics. He introduced the concept of the “Dedekind cut” in 
1872. He edited the work of Gauss, Dirichlet, and Riemann. He coined the term “ideal” 
and developed ideal theory and unique factorization theory. While studying ideals 
in algebraic number fields, he introduced the concept of the ascending chain condition. 
He replaced the concept of a permutation group by that of an abstract group. He loved 
teaching Galois theory. He died on February 12, 1916. 

Peter Ludvig Mejdell Sylow (1832-1918) He was born on December 12, 1832, 
in Oslo, Norway. In 1872, he published a paper of fundamental importance in Math. 
Annalen 5, 584-594. It contained eight theorems and extended Cauchy’s result. The 
theorems in this paper are known as Sylow’s Theorems. He is famous for his work 
on structural results in finite group theory. He died on September 7, 1918. Sylow 
and Lie prepared an edition of the complete works of Abel during 1873-1881 with 
Sylow as the main author according to Lie. 

Camille Jordan (1838-1922) He was born on January 5, 1838 in Lyons, France. 
He was an engineer, but he entered the Ecole Polytechnique in 1855 to study 
mathematics. He became professor of analysis at the Ecole Polytechnique in 1876. Jordan 
published 120 papers in mathematics. He originated the notion of a bounded 
function. He proved his famous theorem: “A plane can be decomposed into two re- 
gions by a simple closed curve” (called Jordan curve theorem). He was greatly influ- 
enced by the work of Riemann. He used combinatorial methods to work in topology 
and introduced the concept of path homotopy. Jordan was basically an algebraist. 
He developed the theory of finite groups and their applications following Galois. He 
introduced the concepts of composition series, simple groups and epimorphisms, and 
proved the famous Jordan-Holder Theorem. His mathematical 
work of 667 pages “Traite des substitutions et des equations algebriques” published 
in 1870, attracted many scholars like Sophus Lie from Norway and Felix Klein from 
Germany. In the preface to his “Traite”, he acknowledged the contribution of his pre- 
decessors: Galois who invented the principles of Galois Theory, Betti who wrote a 
memoir, in which the complete sequence of Galois Theory had been rigorously es- 
tablished for the first time, Abel, Kronecker, and Cayley. He proved finiteness the- 
orems and introduced the concept of simple groups. He studied the general linear 
groups and the fields of p elements (p is a prime) and applied his work to classical 
groups to determine the structure of Galois groups of equations. He died on January 
22, 1922. 

Sophus Lie (1842-1899) He was born at Nordfjordeid, Norway, in December 
1842. He did significant joint work with C.F. Klein. They went to Paris in 1870 
and lived in adjacent rooms for 2 months. Their joint paper was published in 1871 
in Math. Annalen 4, 424-429. They considered one dimensional continuous groups. 
In later years, they moved in different directions. S. Lie developed his theory of 
continuous maps and used it in investigating differential equations and C.F. Klein 
investigated discrete groups. S. Lie is the inventor of Lie Algebra. The fundamental 
idea of his Lie theory was published in his paper in Math. Annalen 16, in 1880. 
Lie’s investigation of the integration of differential equations led him to 
investigate groups of transformations transforming a differential equation into itself. 
He developed his theory of transformation groups to solve his integration problems. 
Continuous transformation groups (called Lie groups, named after Lie), commutator 
brackets and Lie algebras are essentially due to him. They have wide applications in 
quantum mechanics. He died in 1899. 

Georg Ferdinand Ludwig Philipp Cantor (1845-1918) He was born in March 
1845. He was a German mathematician. He originated naive set theory. He is known 
as the father of set theory, which forms a strong foundation of different disciplines 
in modern science. There are two general approaches to set theory. The first one is 
called “naive set theory” and the second one is called “axiomatic set theory”, also 
known as “Zermelo-Fraenkel set theory”. These two approaches differ in a number 
of ways. While developing mathematical theory of sets to study the real numbers 
and Fourier series, Cantor established the importance of a bijective correspondence 
between two sets and introduced the concepts of infinite sets, well-ordered sets, 
cardinal and ordinal numbers of sets and their arithmetic. He proved that the set 
of real numbers is not countable by a process known as Cantor’s diagonal process. 
This result shows the existence of transcendental numbers (which are not algebraic 
over the field Q of rational numbers) as the set A of algebraic numbers is countable. 
The real transcendental numbers are precisely the members of the set R of real numbers which 
are not algebraic over the field Q. He died on January 6, 1918. 

C. Felix Klein (1849-1925) He did not consider groups in his first paper on Non-Euclidean 
geometry but he considered transformation groups of invertible transformations 
of a manifold in his second paper published in 1873. He defined a group as 
Jordan did. According to Klein, projective geometry and Euclidean geometry deal with 
properties of figures which are invariant under the respective transformations. The Klein 
4-group is named in honor of Klein. His work on “n-dimensional vector spaces and 
the related geometric notion of algebra” in 1870 stimulated the study of linear algebra. 
Inspired by seminar lectures of Kronecker and Klein, the algebraist O.L. Holder 
(1859-1937) completed the proof of the so-called Jordan-Holder theorem on composition 
series, which plays a fundamental role in group theory. 

Jules Henri Poincare (1854-1912) He was a French mathematician and is known 
as the father of topology. He was born in 1854 in Nancy, Lorraine, France. He 
published around 300 papers and several books spread over different areas of pure 
mathematics, celestial mechanics, fluid mechanics, optics, electricity, telegraphy, 
capillarity, elasticity, thermodynamics, potential theory, quantum theory, theory of 
relativity, physical cosmology and philosophy of science. The fundamental group 
defined in Chap. 3 was invented by Henri Poincare in 1895 through his work Anal- 
ysis Situs. This group is also called Poincare’s group in his honor. This group is 
associated to any given pointed topological space. It is a topological invariant in the 
sense that two homeomorphic pointed topological spaces have isomorphic funda- 
mental groups. Intuitively, this group provides information about the basic shape, or 
holes, of the topological space. Poincare’s work in algebraic topology is mainly in 
geometric terms. The fundamental group is the first and simplest of the homotopy 
groups. Poincare was the first mathematician who applied algebraic objects in homotopy 
theory. Historically, the idea of a fundamental group arose in the study of Riemann 
surfaces by Bernhard Riemann, Henri Poincare, and Felix Klein. This group 
has wide applications in different areas. For example, the Brouwer fixed point theorem 
and the fundamental theorem of algebra are proved in this book and an extension 
problem is solved by using homotopy theory and this group. His result on the index of 
the intersection of two subgroups of a finite group is known as Poincare’s theorem. He 
died in 1912 in Paris, France. His famous Poincare conjecture was one of the 
most important long-standing unsolved problems in mathematics until 2002-2003. 
The conjecture is “Every simply connected, closed 3-manifold is homeomorphic 
to the 3-sphere.” Grigory Perelman, a Russian mathematician, solved this problem. 
He was awarded a Fields Medal at the 2006 meeting of the International 
Congress of Mathematicians (ICM) in Madrid, Spain, for “his contributions to geometry 
and his revolutionary insights into the analytical and geometric structure of the Ricci flow”. But 
he declined to accept it. 

David Hilbert (1862-1943) He was born on January 23, 1862 in Konigsberg, 
Germany. The great mathematician Lindemann stimulated Hilbert to work on the 
theory of invariants. He proved his famous theorem, known as the Hilbert Basis Theorem, 
which revolutionized algebraic geometry and ring theory. Greatly influenced 
by the axioms of Euclid, he proposed 21 axioms with their significance. 
While addressing the Second International Congress of Mathematicians (ICM) of 
1900 at Paris, Hilbert posed 23 interesting problems for investigation including the 
continuum hypothesis, well ordering of reals, transcendence of powers of algebraic 
numbers, Riemann hypothesis, extension of the Principle of Dirichlet and others. 
He also worked on algebraic number theory, foundations of geometry, integral equations, 
calculus of variations, functional analysis and theoretical physics. He died on 
February 14, 1943. 

Amalie Emmy Noether (1882-1935) She was born on March 23, 1882, in Erlangen, 
Germany. Her father, Max Noether, was a famous mathematician. She started 
her mathematics career at the University of Gottingen in 1903 as a non-regular student, 
as at that time women were not allowed to be admitted as regular students. 
However, she was permitted in 1904 to enroll at the University of Erlangen where 
her father taught. Hilbert invited her to Gottingen in 1915. After prolonged efforts 
of Hilbert, she was appointed an associate professor and taught there from 1922 to 
1933. As she was Jewish, she had to leave the university because of the rise of the 
Nazi regime and went to the USA in 1933. She gave lectures and did research work 
at Bryn Mawr College. In 1921 Noether extended the Dedekind theory of ideals 
and the representation theory of integral domains and rings of algebraic numbers 
to arbitrary commutative rings satisfying the ascending chain condition. These rings 
are now called Noetherian rings. Motivated by Hilbert’s axiomatization of Euclidean 
geometry, Noether became interested in an abstract axiomatic approach to ring the- 
ory. Noether developed a general representation theory of groups and algebras over 
arbitrary ground fields. Her 45 research papers are divided into four categories: 

Category 1. Group-Theoretic Foundations. 

Category 2. Non-commutative Ideal Theory. 

Category 3. Modules and Representations. 

Category 4. Representations of Groups and Algebras. 

The basic concepts of the modern theory of rings came through the work of Noether 
and Artin during the 1920s. 

The ascending chain condition was introduced by Noether in her theory of ideals. 
The concept of Noetherian rings is her invention. She died on April 14, 1935. 

Joseph Henry Maclagan Wedderburn (1882-1948) He was born on February 
26, 1882, in Forfar, Scotland. He published 38 papers from 1905 to 1928 
and a book on matrices in 1934. He also worked on the structure of algebras 
over arbitrary fields, instead of algebras over the fields of complex numbers 
or real numbers. In 1905 he proved the celebrated theorem known as the ‘Wedderburn 
theorem’, which asserts that a finite division ring is commutative. Its 
intrinsic beauty lies in interlinking the number of elements in a certain algebraic 
system and the multiplication of that system. This result appears in many 
contexts and opened up a large area of research. His work on finite algebras 
revolutionized projective geometry with a finite number of points. For 
example, his proof of the geometrical result that the Desargues configuration implies 
the Pappus configuration is of immense intrinsic beauty. He died on October 9, 
1948. 

Emil Artin (1898-1962) He was born on March 3, 1898, in Vienna, Austria. 
Artin generalized some results of Wedderburn on algebras over fields in 1927, by 
considering rings with descending chain condition. He worked on various areas of 
mathematics such as number theory, group theory, ring theory, field theory, geometric 
algebra, and algebraic topology. The Artinian ring is his invention. He proved in 1927 
the general laws of reciprocity, which cover all the previous laws of reciprocity up to 
the time of Gauss. He formulated Galois theory in an abstract setting and published 
it in the form used today, establishing a connection between field extensions 
and subgroups of automorphisms. He died on December 20, 1962. 

Oscar Zariski (1899-1986) He was born in the city of Kobrin, then part of 
the Russian Empire. In 1920, Kobrin became part of independent Poland under a 
political agreement between Russia and Poland. He opted for Polish nationality 
to make it easier to study mathematics. His work in the area of algebraic geometry 
is fundamental. The subject algebraic geometry arose through the study of the 
solution sets of polynomial equations and can be traced back to Descartes. But it 
became an important mathematical discipline in the 19th and 20th centuries. By 
the early twentieth century, the Italian school of algebraic geometers established 
many interesting results and investigated many basic problems of algebraic geome- 
try. Zariski joined the strong group of the Italian school and by using modern algebra 
the school reformulated the subject. Zariski introduced a topology with the help of 
algebraic sets as closed sets. This topology is now known as Zariski topology. He 
was a professor of mathematics at Harvard University and made Harvard a world 
center for algebraic geometry and formed the basis for its twentieth century devel- 
opment. 

C.1 Additional Reading 

We refer the reader to the books (Adhikari and Adhikari 2003, 2004; Bell 1962; 
Birkhoff and Mac Lane 2003; Hazewinkel et al. 2011; van der Waerden 1960) for 
further details. 